
Copyright 2001, J. Touch, USC/ISI. All rights reserved. Nov. 19, 2001

An Architecture for Virtual Internets

Joe Touch
Director, Postel Center for Experimental Networking
Computer Networks Division
USC/ISI


Outline

Background
Architecture
Projects
  X-Bone – FreeBSD/Linux tool to deploy VIs for experiments, testbeds, and lab classes
  DynaBone – applying layered VIs for fault tolerance and DDOS resistance
  NetFS – OS extension of network control and access API to support concurrent VIs


– What is a VI? –

A network using encapsulation-based links
A way to test new protocols
A way to share infrastructure
A way to virtualize a network topology, as VM is to memory


Concurrent VIs

[Figure: Star-ovl and Ring-ovl running concurrently over the IP base network]


User’s view of VIs

[Figure: nodes A, B, C, D as seen in star-ovl, in ring-ovl, and in the IP base network]


Uses of VIs

Increase sharing
  Concurrent use
  Partition resources
  Deploy peer services, test protocols
Simplify views of a complex structure
  Hierarchies: layering (recursion), divide-and-conquer, embedding
Increase portability
  Indirection allows remapping
  Remapping for fault tolerance, mobility


Virtual Systems

Logical version of a real, physical resource
Virtual memory
  Larger space
  Via mapping memory onto hard disk
Virtual machine
  Emulated PCs (VMware), portable code (P-mch, JVM)
  Via emulation of PC
Virtual circuit
  Multiple connections over a single path
  Via packet switching and end-to-end state


Network Virtualizations

Wire -> virtual wire (packets)
  Share links, provide fault tolerance
NIC -> VIF
  Emulate multiple end systems
LAN -> VLAN
  Share switching / bridging resources
VPNs and overlays
  Emulate and share the entire network


Challenges of VIs

Extension of the Internet architecture
  Compatible, incrementally-deployable
Scalable deployment and management
  Divide-and-conquer, merge, split
Inter-VI access
  Access to services across VI boundaries
‘the Graph Embedding problem’
  Optimization, fault tolerance can be hard


– VI Architecture –

Multihoming
  Multiple Internets, not just AS’s
  Use VIFs and iterative forwarding
Tunneling
  Weak network layer for endpoint addressing
  Strong link layer for routing, forwarding control
  Integrate with dynamic routing, IPsec
Addressing
  In the end system, e.g., OS API
  Naming over the wide area


Multihoming

[Figure: three host models]
RFC 1122/1123 Host: IP address binds to one NIC
Multihomed Host: IP address binds to each NIC
Host with NICs and VNICs attached to a Virtual Router
Apps can’t select source IP, no IP w/o NIC


Host Implications

Need for an internal router
  Must participate in routing protocols
Input interface groups
  INADDR_ANY on subsets of interfaces
Output interface selection
  VIF as source of all traffic
DNS context sensitive replies


Router Implications

VI-sensitive forwarding
  Solve via separate IP spaces (merge VI-ID with endpoint ID)
Intra-VI routing protocols
  Solve via admit/exclude rules among subsets of interfaces (preprocess gated/mrtd config files)


Problem: lost context

Incoming tunnel is context
  input (de-tunnel, de-IPSEC, demux)
  forwarding
  route exchanges
Keep this context
  retain on decaps.
  use as context for processing
  currently via separate IP space
  later via Overlay ID
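
A minimal sketch of retaining tunnel context across decapsulation, in the style of the "later via Overlay ID" option. The tunnel endpoint letters, the address 10.0.0.2, and the interface names vif-star and vif-ring are hypothetical values chosen for illustration; only the overlay names star-ovl and ring-ovl come from the slides. The point is that the same inner destination can forward differently depending on which tunnel the packet arrived on.

```python
from typing import Dict, NamedTuple, Tuple


class InnerPacket(NamedTuple):
    context: str  # overlay the packet arrived on, kept after decapsulation
    dst: str
    data: str


# Hypothetical table: which overlay each incoming tunnel (src, dst) belongs to.
TUNNEL_TO_OVERLAY: Dict[Tuple[str, str], str] = {
    ("Q", "R"): "star-ovl",
    ("S", "T"): "ring-ovl",
}

# Hypothetical per-overlay forwarding table, keyed by (context, destination).
FORWARDING: Dict[Tuple[str, str], str] = {
    ("star-ovl", "10.0.0.2"): "vif-star",
    ("ring-ovl", "10.0.0.2"): "vif-ring",
}


def decapsulate(tunnel: Tuple[str, str], dst: str, data: str) -> InnerPacket:
    """Strip the tunnel header but remember which overlay it identified."""
    return InnerPacket(TUNNEL_TO_OVERLAY[tunnel], dst, data)


def next_interface(pkt: InnerPacket) -> str:
    """Forward using the retained context together with the destination."""
    return FORWARDING[(pkt.context, pkt.dst)]


# Same inner destination, different arrival tunnel, different next interface.
via_star = next_interface(decapsulate(("Q", "R"), "10.0.0.2", "DATA"))
via_ring = next_interface(decapsulate(("S", "T"), "10.0.0.2", "DATA"))
```

If the context were dropped at decapsulation, the two lookups above would be indistinguishable, which is the problem this slide names.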


Dynamic routing

Double encapsulation:
  Overlay endpoints
  Overlay link
Supports
  Multihop overlay (routing within the overlay)
  Multiple visits to a single router
[Packet layout: Data | Ovl Ends | Ovl Link | Base Inet]
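
As a rough illustration of double encapsulation, the sketch below models a packet as data plus a stack of (source, destination) headers. The single-letter addresses follow the labels in the DE Networking figure (overlay endpoints A and D, overlay links Q-R and S-T, base hosts X, Y, Z); the `Packet` class and helper names are assumptions made for this example, not part of any X-Bone code.

```python
from dataclasses import dataclass
from typing import Tuple

Header = Tuple[str, str]  # (source, destination)


@dataclass(frozen=True)
class Packet:
    data: str
    headers: Tuple[Header, ...]  # innermost first, outermost last


def encapsulate(pkt: Packet, src: str, dst: str) -> Packet:
    """Wrap the packet in one more (src, dst) header."""
    return Packet(pkt.data, pkt.headers + ((src, dst),))


def decapsulate(pkt: Packet) -> Tuple[Header, Packet]:
    """Strip the outermost header and return it with the inner packet."""
    return pkt.headers[-1], Packet(pkt.data, pkt.headers[:-1])


# Sending host X: overlay endpoint header, overlay link header, base header.
p = Packet("DATA", ())
p = encapsulate(p, "A", "D")          # overlay endpoints
p = encapsulate(p, "Q", "R")          # overlay link for the first hop
first_hop = encapsulate(p, "X", "Y")  # base Internet

# Router Y: remove base and link headers, keep the endpoint header,
# then re-encapsulate for the next overlay link.
_, p = decapsulate(first_hop)
_, p = decapsulate(p)
p = encapsulate(p, "S", "T")
second_hop = encapsulate(p, "Y", "Z")
```

The endpoint header [A,D] is unchanged across hops while the link and base headers are replaced at the router, which is what lets routing run inside the overlay.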


DE Networking

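[Figure: host Base-X (Ovl-A, OLink-Q) and host Base-Z (Ovl-D, OLink-T) communicate through router Base-Y (Ovl-B with OLink-R, Ovl-C with OLink-S). The overlay packet DATA [A,D] travels as DATA [A,D][Q,R][X,Y] on the first hop and as DATA [A,D][S,T][Y,Z] on the second.]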


DE In Action

1. App emits (D)[-,E4]

2. *E routes to VIF1

3. VIF1 adds:
   source IP (D)[E1,E4]
   ‘link’ (D)[E1,E4]+[L1,L2]

4. L2 routes to VIF2

5. VIF2 adds ‘phys’ (D)[E1,E4][L1,L2]+[P1,P2]

6. Internet routes (D)[E1,E4][L1,L2][P1,P2]
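
The six steps can be sketched as a loop that repeatedly routes on the outermost destination and lets the chosen virtual interface add a header. The routing table, the `vif1`/`vif2` functions, and the "NIC" label are hypothetical stand-ins; the slide's wildcard route for E destinations is reduced to a single entry for E4.

```python
from typing import Callable, Dict, List, Optional, Tuple

Header = Tuple[Optional[str], str]  # (source or None if unset, destination)

# Hypothetical host routing table: outermost destination -> outgoing interface.
ROUTES: Dict[str, str] = {"E4": "VIF1", "L2": "VIF2", "P2": "NIC"}


def vif1(headers: List[Header]) -> List[Header]:
    """Step 3: fill in the overlay source, then add the 'link' header."""
    (_, dst), rest = headers[0], headers[1:]
    return [("E1", dst)] + rest + [("L1", "L2")]


def vif2(headers: List[Header]) -> List[Header]:
    """Step 5: add the 'phys' (base network) header."""
    return headers + [("P1", "P2")]


VIFS: Dict[str, Callable[[List[Header]], List[Header]]] = {
    "VIF1": vif1,
    "VIF2": vif2,
}


def send(headers: List[Header]) -> Tuple[List[Header], List[str]]:
    """Route, encapsulate, and route again until a real NIC is chosen."""
    trace: List[str] = []
    while True:
        out_if = ROUTES[headers[-1][1]]  # route on the outermost destination
        trace.append(out_if)
        if out_if == "NIC":
            return headers, trace
        headers = VIFS[out_if](headers)


# Step 1: the application emits (D)[-,E4] with no source address yet.
wire_headers, path = send([(None, "E4")])
```

Each pass through the loop is one routing decision, which is why the host needs forwarding that can visit the routing table more than once per packet.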



Parallel tunnels

Multiple paths between two endpoints
  allows a single node to play more than once in a single overlay
Multiple tunnels
  ‘Strong’ host model (IP per NIC)
  Peek-ahead during decapsulation
  Provides per-tunnel statistics and control
Aliases
  Susceptible to interface contention
  Harder to control source address
  Requires less OS support


Multi-tunnel vs. aliases

[Figure: Base-X carries Ovl-A (Olink-Q) and Ovl-C (Olink-S); Base-Y carries Ovl-B (Olink-R) and Ovl-D (Olink-T)]
Packets on the wire (same in both cases): DATA [A,B][Q,R][X,Y] and DATA [C,D][S,T][X,Y]
Multi-tunnel: separate tunnel VIFs (Olink-Q-R, Olink-R-Q, Olink-S-T, Olink-T-S). Which VIF decapsulates?
Aliases: One VIF decapsulates both


HBH IPSEC

Use where E2E not available
Secures HBH protocols – routing, ICMP


IPSEC for the overlay

[Packet layout: DATA | Ovl-Src, Ovl-Dst | OLink-Src, OLink-Dst | Base-Src, Base-Dst]
Application IPSEC (overlay endpoints)
Virtual network IPSEC (overlay links)
Base network IPSEC (base endpoints)


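Dyn. Routing + IPSEC

Key-per-link interferes with routing
Solve with VIF using IPIP then IPSEC

[Figure: Tunnel Mode IPSEC compared with IIPtran. Tunnel mode wraps DATA | IP src, IP dst with Tun src, Tun dst and IPSEC in one step. IIPtran applies (1) IPIP encapsulation at the VIF, then (2) IPSEC. Nodes A, B, C, Z with per-link keys K1 and K2 and VIFs V1, V2.]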


Integrating Routing

Gated / mrtd
  via gated.conf / mrtd.conf script processing
  isolate RIP announcements within each overlay, separate from base network
Mrouted
  via mrouted.conf pre-processing
  isolate overlay multicast routing via boundary on virtual IP interfaces


Costs of Encapsulation

Packet MTU limits
  Layers eat packet space
  May stress impls.
Bandwidth costs
  20% (10% IPSEC’d)
Latency costs
  0.02-0.06 msec per hop
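
A back-of-the-envelope calculation of how header layers consume packet space, assuming each encapsulation layer adds a minimum 20-byte IPv4 header with no options and no IPsec overhead. The function names are invented for this sketch, and the numbers are arithmetic illustrations, not a reproduction of the measured costs quoted on the slide.

```python
IPV4_HEADER = 20  # bytes: minimum IPv4 header, no options (assumption)


def payload_after_encapsulation(mtu: int, extra_headers: int) -> int:
    """Bytes left for data once the original and extra IP headers are counted."""
    return mtu - IPV4_HEADER * (1 + extra_headers)


def overhead_fraction(payload: int, extra_headers: int) -> float:
    """Extra bytes on the wire relative to the unencapsulated packet size."""
    original_size = payload + IPV4_HEADER
    return IPV4_HEADER * extra_headers / original_size


# Two extra headers (overlay link + base) on a 1500-byte MTU path.
room = payload_after_encapsulation(1500, 2)

# The relative cost depends strongly on packet size.
small_packet_cost = overhead_fraction(180, 2)   # 200-byte original packet
large_packet_cost = overhead_fraction(1480, 2)  # 1500-byte original packet
```

Under these assumptions a full-size packet loses 40 bytes of room, while a small packet pays a much larger relative cost than a large one.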


Layered double tunnels

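[Figure: packet headers at each layer]
Base network only: DATA | Base-Src, Base-Dst
One overlay: DATA | Ovl-Src, Ovl-Dst | OLink-Src, OLink-Dst | Base-Src, Base-Dst
Overlay on an overlay: DATA | OvlSrc2-OvlDst2 | OLinkS2-OLinkD2 | Ovl-Src, Ovl-Dst | OLink-Src, OLink-Dst | Base-Src, Base-Dst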


Problem: Service Deployment

[Figure: deploying an http service on ring-ovl (nodes A, B, C, D). The XB-OM runs a Generic ABone Generator Script to produce a Node Action File for each RD.]
Overlay-Specific Parameters (User Input): TCL/ACL, JDK
Node-Specific Parameters (XBone-Auto): Ovl Name, IPs, Topology


Network Reentrancy

Need VI context sensitive:
  View of interface list
  View of ports
  Logins
  File systems (for logs)


Problem: Recursion

Easy if deterministic
  One inner layer
Harder if policy-based layering
  Layer N determines Layer N-1
[Figure: layers A, B, C over X and Y, with a policy choice at layer B]


Recursion solutions:

ARP
  Treats lower layer like link
  Needs broadcast
BGP
  Treat inner network like a transit AS
  Needs to determine encapsulation


––– Projects –––

X-Bone
  DWIM VI Deployment
DynaBone
  Multilayer spread-spectrum VIs
NetFS
  Context-sensitive views


– DWIM VIs (X-Bone) –

DWIM concept
  API
  Useful defaults (esp. to get around complexities)
“COTS” distributed management
  Expanding ring search
  Soft state with hard backup
  Heartbeats
  ACLs and resource management
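
Two of the management techniques named above, expanding ring search and heartbeat-refreshed soft state, can be sketched without any network I/O. Everything here is an illustrative assumption: the probe is a stand-in function over a made-up hop-count table, node names like rd-a are hypothetical, time is passed in explicitly, and the "hard backup" is not modeled.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple


def expanding_ring_search(
    probe: Callable[[int], Set[str]], needed: int, max_ttl: int = 32
) -> Optional[Tuple[int, Set[str]]]:
    """Double the search radius (TTL) until enough responders are found."""
    ttl = 1
    while ttl <= max_ttl:
        found = probe(ttl)
        if len(found) >= needed:
            return ttl, found
        ttl *= 2
    return None


class SoftStateTable:
    """State that disappears unless refreshed by periodic heartbeats."""

    def __init__(self, lifetime: float) -> None:
        self.lifetime = lifetime
        self._expires: Dict[str, float] = {}

    def heartbeat(self, node: str, now: float) -> None:
        self._expires[node] = now + self.lifetime

    def expire(self, now: float) -> List[str]:
        """Drop entries whose lifetime has passed; return what was dropped."""
        stale = sorted(n for n, t in self._expires.items() if t <= now)
        for node in stale:
            del self._expires[node]
        return stale

    def live(self) -> List[str]:
        return sorted(self._expires)


# Hypothetical topology: hop distance from the searcher to each daemon.
HOPS = {"rd-a": 1, "rd-b": 3, "rd-c": 6}


def fake_probe(ttl: int) -> Set[str]:
    return {node for node, hops in HOPS.items() if hops <= ttl}


result = expanding_ring_search(fake_probe, needed=2)

table = SoftStateTable(lifetime=30.0)
table.heartbeat("rd-a", now=0.0)
table.heartbeat("rd-b", now=0.0)
table.heartbeat("rd-a", now=20.0)  # rd-a refreshes, rd-b goes silent
expired = table.expire(now=35.0)
```

In a real deployment the probe would be a scoped multicast query and the clock would come from the system; both are injected here so the logic can be read and tested in isolation.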


X-Bone Objectives

Dynamically deploy overlay networks
  user/application setup, monitor, teardown
Via existing stacks in new ways
  integrate IPsec, dynamic routing
With enhanced capability
  hierarchical, stackable
  nodes in multiple overlays, in a single overlay multiple times


X-Bone System View

[Figure: the X-Bone system. A Web GUI and xd GUIs talk to Overlay Managers, which control Resource Daemons on hosts and routers, with multiple views and automated monitoring. A Star Overlay (star-ovl) and a Ring Overlay (ring-ovl) over nodes A, B, C, D share the Base IPv4 Network.]


Creating the Ring

[Figure: the OM takes a Request and returns a Result: a ring overlay built over Internet hosts isipc2, eql, udel, sec, cos, div, sin, bbn]


X-Bone Components

[Figure labels: SNMP/RSVP, Distributed Control, Impl., Cartwheels, VI Architecture]


Impact Goals

Reduce deployment effort
  Deploy tools easily, overlays effortlessly
  Safe configuration, management, monitoring
  Existing OSs, apps., network infrastructure
Extend network architecture
  Dynamic, concurrent overlays
  Recursive / stackable overlays
  Share in multiple overlays, multiply in one


The X-Bone is…

A system for automated overlay deployment
  among a closed set of trusted hosts and routers
  provide coordination, configuration, management
  many details are plug-replaceable
New tricks for overlays (use of overlays)
  overlays on overlays on overlays on …
  fault tolerance, service deployment
  member in multiple overlays, in single multiple times
New tricks for old dogs (extend network arch.)
  use existing stacks and applications


What We Don’t Do…

Optimize the overlay topology
  we use a plug-in module (AI folk can provide)
  it requires network status (emerging now)
  fault tolerance only via ground truth (admin. issue)
  X-Bone is capability more than performance (now)
Non-IP overlays
  IP is the interoperability layer
  IP recurses / stacks nicely


Before/after

Task | Before X-Bone | With X-Bone
Select properties | manual; ad-hoc | manual or via program; pick from menus
Select components | manual; OOB e-mail, phone | automated; OM finds RDs via multicast
Design | manual | automated; OM computes topology
Install | manual; OOB, telnet, SNMP | automated; OM configures RD via TCP
Monitor | Various in-band tools; infer from visible state | X-Bone tools; explicitly monitor state
Dismantle | telnet, SNMP, or e-mail to off-line recorded state | automated; on command, timer-based


Related Work

Darwin/VNS (CMU)
  deploy a reserved core overlay (QoS)
Netscript/VAN (Columbia)
  deploy a set of virtual NICs in EEs (Anets)
Detour (U. Wash.) / RONs (MIT)
  patch routing with tunnels
VPNs
  fence-out, incremental, exclusive, host-focus
Multi-level – MorphNet, Supranet
ATM – GUILN, Switchlets
Manual overlays – Mbone, 6bone, A-Bone


X-Bone Differences

Integrated end-to-end overlays
  overlays as more than an interim solution
  extend architecture (IPsec, multihoming)
Recursive Internet architecture
  runs on IP; provides IP to upper level
Deploying an alpha-grade tool
  increase sharing, ease setup (CAIRN, AN)
  simplify applications, user use
  safe, secure, coordinated
Production use for classes, testbeds


X-Bone Users

ISI Network lab – 17 Fbsd/Linux
USC CS net lab – 24 Linux, 48 students
UCL – 6 Fbsd nodes
CAIRN – 10 Fbsd nodes
LUT / 3G – Finnish dynamic mobile svcs
Canadian Gov’t (CRC)
Project A-Bone – deploying the backbone


– AntiDDOS (DynaBone) –

Spread-spectrum parallel defenses
  RAID for packets
Adaptive configuration
  Proactive and reactive management
Using existing OS/App/protocols
  Like X-Bone


Performance tradeoffs

[Figure: tradeoffs among Bandwidth, Latency, and CPU load]


Bandwidth variations


Latency variations


DynaBone architecture

Spread-Spectrum Multilayer Internet Overlays

[Figure: an Outerlay whose PRMs spread traffic across parallel Innerlays running over the Base network]
Innerlays:
  3DES encrypt / Linkstate
  RC5 encrypt / RIP
  MD5 auth / static


Reacting to attack

[Figure: the same Outerlay, PRMs, and Innerlays (3DES encrypt / Linkstate, RC5 encrypt / RIP, MD5 auth / static) over the Base network, with one element marked X]


PRM Detail

[Figure: components of a PRM]
Mux: per packet? per TCP?
Demux: reorder? drop dups?
Monitor: inject, measure
DDOS Attack Detection
Performance Metrics (pathchar)


Why use overlays?

PRMs can coordinate
  FEC-style: replicate on each Innerlay, filter copies at receiver
  TCP SYN: send on high-security Innerlay, data on high-speed Innerlay; receiver accepts SYNs only from high-security Innerlay
Algorithmic diversity
  IPsec, routing, DNS, etc.
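
The replicate-and-filter coordination can be sketched as a sender that copies each packet onto every innerlay and a receiver that delivers the first copy of each sequence number and drops the rest. The innerlay identifiers are shortened from the labels in the DynaBone figures; the sequence-number scheme, the `Receiver` class, and the simulated loss of one innerlay are assumptions for illustration only.

```python
from typing import List, Optional, Set, Tuple

Copy = Tuple[str, int, bytes]  # (innerlay name, sequence number, payload)

# Names abbreviated from the figure labels (assumption for this sketch).
INNERLAYS = ["3des-linkstate", "rc5-rip", "md5-static"]


def replicate(seq: int, payload: bytes, innerlays: List[str]) -> List[Copy]:
    """Sender side: send one copy of the packet on every innerlay."""
    return [(name, seq, payload) for name in innerlays]


class Receiver:
    """Receiver side: deliver the first copy seen, drop later duplicates."""

    def __init__(self) -> None:
        self._seen: Set[int] = set()  # unbounded here; a real one would age out

    def accept(self, copy: Copy) -> Optional[bytes]:
        _, seq, payload = copy
        if seq in self._seen:
            return None
        self._seen.add(seq)
        return payload


copies = replicate(1, b"hello", INNERLAYS)

# Simulate an attack that blackholes one innerlay.
survivors = [c for c in copies if c[0] != "rc5-rip"]

rx = Receiver()
delivered = [p for p in (rx.accept(c) for c in survivors) if p is not None]
```

The packet is delivered exactly once even though two copies arrive and one innerlay is lost, at the cost of sending every packet several times.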


– NetFS –

X-Bone application deployment highlights need for compartmentalized root
Solution:
  File system API to network config, sockets
  Extends file system’s fine-grained security
  Sandboxes services
  Sandboxes network management
  Single API for network apps across OS’s


Goals

Simple, standard interface
  Across different OS’s
  File system API and semantics
Fine-grained security
  User, group, world, etc.
  Per instance of each resource
Context-dependent views
  Limits “ifconfig -a” response


Intertwined control

[Figure: interfaces, routes, and communication channels are each controlled through a different mix of APIs: Socket API, sockopt, ioctl, sysctl, In-band API]


NetFS control

[Figure: interfaces, routes, and communication channels are all controlled through the NetFS File API]


/netfs file system

[Figure: directory tree rooted at /netfs. Top-level entries: iface, route, ipfw, proto. Lower-level labels: fxp0, lo, default, alias1, alias2, ether, ip, tcp, udp, 25, 26, addr, mask, ipsec. Leaf values shown: 10.0.0.1, 255.0.0.0, 10.3.0.0 255.255.0.0.]


Named pipe impl.

[Figure: under /netfs, iface/fxp0/ip/addr is a named pipe holding 10.0.0.1 (Read = 10.0.0.1, Write = ). A route entry with addr 10.0.0.0 and mask 255.0.0.0 refers to the interface through a route symlink.]
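
To give a feel for a file-system API to network configuration, the sketch below builds a throwaway directory tree shaped like the /netfs figures and reads and writes an interface address with ordinary file operations. It is only an emulation under stated assumptions: the exact path layout (iface/fxp0/ip/addr) is inferred from the figure labels, plain files stand in for the named pipes, and the helper names are invented here. It does not touch real network configuration.

```python
import tempfile
from pathlib import Path
from typing import List


def build_fake_netfs(root: Path) -> None:
    """Create a small tree mimicking the /netfs layout in the figures."""
    ip_dir = root / "iface" / "fxp0" / "ip"
    ip_dir.mkdir(parents=True)
    (ip_dir / "addr").write_text("10.0.0.1\n")
    (ip_dir / "mask").write_text("255.0.0.0\n")
    (root / "iface" / "lo").mkdir()


def list_interfaces(root: Path) -> List[str]:
    """The interface list is just a directory listing."""
    return sorted(p.name for p in (root / "iface").iterdir())


def get_addr(root: Path, iface: str) -> str:
    """Reading the file returns the current address."""
    return (root / "iface" / iface / "ip" / "addr").read_text().strip()


def set_addr(root: Path, iface: str, addr: str) -> None:
    """Writing the file would request a new address."""
    (root / "iface" / iface / "ip" / "addr").write_text(addr + "\n")


with tempfile.TemporaryDirectory() as tmp:
    netfs_root = Path(tmp) / "netfs"
    build_fake_netfs(netfs_root)
    interfaces = list_interfaces(netfs_root)
    before = get_addr(netfs_root, "fxp0")
    set_addr(netfs_root, "fxp0", "10.0.0.2")
    after = get_addr(netfs_root, "fxp0")
```

Because every resource is a file, access control could reuse per-file user and group permissions, and a process given a restricted view of the tree would simply see fewer interfaces in the listing.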


Process-context view

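[Figure: the global /netfs tree (iface, route) with entries A, B, X, Y, Z, and the per-process views ~netfs seen by Process A and Process B, each with its own iface and route subtrees]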


Related work

Linux’s /procfs
  Processes
Jail
  Limits root access to 1 IP addr per partition
Plan 9’s /net
  Sockets
FreeBSD extensions (underway)
  Add naming (kernel hack) to interfaces


– Results –

Architecture
  Two layer tunnels (IETF VPN)
  Decoupled encapsulation from IPsec (IETF IPsec)
  X-Bone, DynaBone, and NetFS targets
DWIM system for experiments / testbeds
  X-Bone in FreeBSD 4.x /usr/ports, Linux 7.x* RPM
  A-Bone deployment
Implementation fixes
  Iterative forwarding (IETF TSVWG/SCTP)
  Long list of interfaces (dhcp, etc)
  IPsec keys on VIFs (FreeS/WAN), no dual-mode


Further information

X-Bone
  http://www.isi.edu/xbone
  FreeBSD 4.3+ in /usr/ports/net/xbone
  Linux RPM from website
  Papers in Global Internet 1998 (at Globecom), ICNP 2000, Computer Networks July 2001
DynaBone
  http://www.isi.edu/dynabone (coming soon)