Indirection - TU Berlin · Multicast groups Class D Internet ... Peer-to-peer networks ... monitors...

Indirection

Indirection: rather than referencing an entity directly, reference it (“indirectly”) via another entity, which in turn can or will access the original entity

"Every problem in computer science can be solved by adding another level of indirection" -- Butler Lampson

[Figure: A references B indirectly via x]

Multicast: one sender to many receivers

Multicast: Act of sending datagram to multiple receivers with single “transmit” operation

Analogy: One teacher to many students

Question: How to achieve multicast?

Network multicast

Routers actively participate in multicast, making copies of packets as needed and forwarding them towards multicast receivers

Multicast routers (red) duplicate and forward multicast datagrams

Internet Multicast Service Model

multicast group concept: use of indirection

host addresses IP datagram to multicast group

routers forward multicast datagrams to hosts that have “joined” that multicast group

[Figure: hosts 128.119.40.186, 128.59.16.12, 128.34.108.63, and 128.34.108.60, and the multicast group 226.17.30.197]

Multicast groups

Class D Internet addresses (224.0.0.0 to 239.255.255.255) reserved for multicast

Host group semantics:

o anyone can “join” (receive) multicast group

o anyone can send to multicast group

o members of the group are not identified to hosts at the network layer

Needed: Infrastructure to deliver mcast-addressed datagrams to all hosts that have joined that multicast group

Joining a mcast group: Two-step process

Local: Host informs local mcast router of desire to join group: IGMP (Internet Group Management Protocol)

Wide area: Local router interacts with other routers to receive mcast datagram flow

many protocols (e.g., DVMRP, MOSPF, PIM)
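The local step of joining can be sketched from the host's side. This is a minimal sketch, assuming a standard BSD-style socket API as exposed by Python's socket module; the port number is a hypothetical value chosen for illustration, and the group address is the one from the earlier example. Setting IP_ADD_MEMBERSHIP asks the operating system to join the group, which is what prompts the IGMP exchange with the local multicast router; the wide-area step is invisible to the host.

```python
import ipaddress
import socket

GROUP = "226.17.30.197"  # group address taken from the example above
PORT = 5007              # hypothetical port, chosen only for illustration


def is_multicast_group(addr: str) -> bool:
    """Return True if addr lies in the Class D range 224.0.0.0/4."""
    return ipaddress.ip_address(addr) in ipaddress.ip_network("224.0.0.0/4")


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack an ip_mreq structure: group address followed by local interface."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def join_group(group: str = GROUP, port: int = PORT) -> socket.socket:
    """Open a UDP socket and join the group on the default interface."""
    if not is_multicast_group(group):
        raise ValueError(f"{group} is not a multicast address")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining is a local operation; the OS signals the router via IGMP.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock
```

A receiver would call join_group() and then read datagrams with recvfrom(); the sender needs no membership at all and simply sends to (GROUP, PORT).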

[Figure: IGMP runs between hosts and their local multicast routers; wide-area multicast routing runs among the routers]

Multicast via Indirection: Why?

Don't need to individually address each member in the group: header savings

Looks like unicast; application interface is simple, single group

Abstraction: delegates the implementation work to the routers

More scalable: the sender doesn't manage the group; as receivers are added, the new receivers do the work of adding themselves

How do you contact a mobile friend?

Consider a friend who frequently changes addresses: how do you find her?

Search all phone books?

Call her parents?

Expect her to let you know where she is?

[Figure caption: “I wonder where Alice moved to?”]

Mobility and Indirection

Mobility and indirection:

Mobile node moves from network to network

Correspondents want to send packets to mobile node

Two approaches:

Indirect routing: Communication from correspondent to mobile goes through home agent, then forwarded to remote

Direct routing: Correspondent gets foreign address of mobile, sends directly to mobile

Mobility: Vocabulary

Home network: permanent “home” of mobile (e.g., 128.119.40/24)

Permanent address: address in home network, can always be used to reach mobile e.g., 128.119.40.186

Home agent: entity that will perform mobility functions on behalf of mobile, when mobile is remote

[Figure: home network, wide area network, correspondent]

Mobility: more vocabulary

Care-of-address: address in visited network (e.g., 79.129.13.2)

Visited network: network in which mobile currently resides (e.g., 79.129.13/24)

Permanent address: remains constant (e.g., 128.119.40.186)

Foreign agent: entity in visited network that performs mobility functions on behalf of mobile.

Correspondent: wants to communicate with mobile

Mobility: registration

[Figure: home network and visited network connected by wide area network]

1. Mobile contacts foreign agent on entering visited network

2. Foreign agent contacts home agent: “this mobile is resident in my network”

End result:

Foreign agent knows about mobile

Home agent knows location of mobile

Mobility via Indirect Routing

[Figure: home network, visited network, and correspondent connected by wide area network; the four steps below are numbered 1 to 4 in the figure]

correspondent addresses packets using home address of mobile

home agent intercepts packets, forwards to foreign agent

foreign agent receives packets, forwards to mobile

mobile replies directly to correspondent

Indirect Routing: comments

Mobile uses two addresses:

Permanent address: used by correspondent (hence mobile location is transparent to correspondent)

Care-of-address: used by home agent to forward datagrams to mobile

Foreign agent functions may be done by mobile itself

Triangle routing: correspondent-home-network-mobile

Inefficient when correspondent and mobile are in same network

Indirect Routing: moving between networks

Suppose mobile user moves to another network

Registers with new foreign agent

New foreign agent registers with home agent

Home agent updates care-of-address for mobile

Packets continue to be forwarded to mobile (but with new care-of-address)

Mobility, changing foreign networks transparent: Ongoing connections can be maintained!
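The forwarding and re-registration steps of indirect routing can be sketched as a toy model. This is a minimal sketch, not the Mobile IP protocol: the Packet and HomeAgent classes, the home agent address 128.119.40.1, and the second care-of-address 203.0.113.7 are all hypothetical and exist only for illustration. The permanent address and first care-of-address are the ones used in the vocabulary slides.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Packet:
    src: str
    dst: str
    payload: Any


@dataclass
class HomeAgent:
    """Holds the binding permanent address -> current care-of-address."""
    address: str
    bindings: Dict[str, str] = field(default_factory=dict)

    def register(self, permanent: str, care_of: str) -> None:
        # Invoked (through the foreign agent) each time the mobile changes network.
        self.bindings[permanent] = care_of

    def intercept(self, packet: Packet) -> Packet:
        care_of = self.bindings.get(packet.dst)
        if care_of is None:
            return packet  # no binding: mobile is at home, deliver unchanged
        # Encapsulation: the original packet rides inside a packet to the care-of-address.
        return Packet(src=self.address, dst=care_of, payload=packet)


def foreign_agent_deliver(outer: Packet) -> Packet:
    """The foreign agent strips the outer header and hands over the inner packet."""
    return outer.payload


home_agent = HomeAgent(address="128.119.40.1")          # hypothetical address
home_agent.register("128.119.40.186", "79.129.13.2")    # mobile enters visited network
datagram = Packet(src="correspondent", dst="128.119.40.186", payload="hello")
tunneled = home_agent.intercept(datagram)

home_agent.register("128.119.40.186", "203.0.113.7")    # mobile moves again (hypothetical)
retunneled = home_agent.intercept(datagram)
```

The correspondent's datagram is identical in both cases; only the binding inside the home agent changes, which is why the move stays transparent to the correspondent.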

Mobility via Direct Routing

[Figure: home network, visited network, and correspondent connected by wide area network, with numbered steps]

correspondent requests, receives foreign address of mobile

correspondent forwards to foreign agent

foreign agent receives packets, forwards to mobile

mobile replies directly to correspondent

Mobility via Direct Routing: comments

Overcome triangle routing problem

Non-transparent to correspondent: Correspondent must get care-of-address from home agent

What happens if mobile changes networks?

Mobile IP

RFC 3220

Has many features we’ve seen:

home agents, foreign agents, foreign-agent registration, care-of-addresses, encapsulation (packet-within-a-packet)

3 components to standard:

agent discovery

registration with home agent

indirect routing of datagrams

Mobility via indirection: why indirection?

Transparency to correspondent

“Mostly” transparent to mobile (except that mobile must register with foreign agent)

transparent to routers, rest of infrastructure

potential concerns if egress filtering is in place in origin networks (since source IP address of mobile is its home address): spoofing?

An Internet Indirection Infrastructure

Motivation:

Today’s Internet is built around point-to-point communication abstraction:

Send packet “p” from host “A” to host “B”

One sender, one receiver, at fixed and well-known locations

… not appropriate for applications that require other communications primitives:

multicast (one to many)

mobility (one to anywhere)

anycast (one to any)

We’ve seen indirection used to provide these services

Idea: Make indirection a “first-class object”

Internet Indirection Infrastructure (I3)

Change communication abstraction: Instead of point-to-point, exchange packets by name

each packet has an identifier ID

to receive packet with identifier ID, receiver R stores trigger (ID, R) into network

triggers stored in network overlay nodes

[Figure: Sender issues send(ID, data); the trigger (ID, R) stored in the network turns this into send(R, data), delivered to Receiver (R)]

Service Model

API

sendPacket(p);

insertTrigger(t);

removeTrigger(t); // optional

best-effort service model (like IP)

triggers periodically refreshed by end-hosts

reliability, congestion control, flow-control implemented at end hosts, and trigger-storing overlay nodes
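The three API calls can be sketched as an in-memory rendezvous table. This is a minimal sketch under simplifying assumptions: a single overlay node, exact-match identifiers, and no trigger expiry or refresh. The class name I3Overlay and the delivered list are hypothetical and stand in for the real overlay and for actual IP delivery.

```python
from collections import defaultdict


class I3Overlay:
    """Toy single-node stand-in for the trigger-storing overlay."""

    def __init__(self):
        self.triggers = defaultdict(set)  # ID -> set of receiver addresses
        self.delivered = []               # (receiver, data) pairs, in delivery order

    def insert_trigger(self, ident, receiver):
        self.triggers[ident].add(receiver)

    def remove_trigger(self, ident, receiver):
        self.triggers[ident].discard(receiver)

    def send_packet(self, ident, data):
        # Best effort: a packet whose ID matches no trigger is silently dropped.
        for receiver in sorted(self.triggers.get(ident, ())):
            self.delivered.append((receiver, data))


overlay = I3Overlay()
overlay.insert_trigger("ID", "R")
overlay.send_packet("ID", "data")    # delivered to R
overlay.send_packet("other", "x")    # no trigger: dropped
```

In this model the sender only ever names the identifier; which hosts receive the data is decided entirely by the triggers the receivers have inserted.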

Discussion

Trigger is similar to routing table entry

Essentially: Application layer publish-subscribe infrastructure

Application-level overlay infrastructure

Unlike IP, end hosts control triggers, i.e., end hosts responsible for setting and maintaining “routing tables”

Provide support for

mobility

multicast

anycast

composable services

Mobility

Receiver updates its trigger as it moves from one subnet to another

mobility transparent to sender

location privacy

[Figure: Sender issues send(ID, data). While the receiver is at address R1, the trigger (ID, R1) turns this into send(R1, data). After the receiver moves to address R2 and updates its trigger, the same send(ID, data) is delivered as send(R2, data).]

Multicast

Unifies multicast and unicast abstractions

multicast: receivers insert triggers with same ID

Application naturally moves between multicast and unicast, as needed

“impossible” in current IP model

[Figure: Receivers R1 and R2 insert triggers (ID, R1) and (ID, R2). Sender issues send(ID, data), which is delivered as send(R1, data) and send(R2, data).]

Anycast

Route to any one in set of receivers

Each receiver i in the anycast group inserts a trigger with the same ID, plus its own anycast qualification (the suffix s_i in ID|s_i)
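One way to pick a single receiver from the group is longest-prefix matching on the identifier. The slides do not say how the qualification is matched, so this is a sketch under that assumption; the short bit strings, the function names, and the fixed group-prefix length are hypothetical stand-ins for real identifiers.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Number of leading symbols that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def anycast_match(packet_id: str, triggers, prefix_len: int):
    """Pick one receiver for packet_id.

    triggers:   iterable of (trigger_id, receiver) pairs
    prefix_len: length of the shared group prefix (the "ID" part of ID|s)
    Only triggers in the same group are eligible; among them the one whose
    identifier shares the longest prefix with packet_id wins.
    """
    group = packet_id[:prefix_len]
    candidates = [(tid, r) for tid, r in triggers if tid[:prefix_len] == group]
    if not candidates:
        return None
    best = max(candidates, key=lambda t: common_prefix_len(t[0], packet_id))
    return best[1]


GROUP_ID = "1010"  # hypothetical 4-bit group prefix
TRIGGERS = [
    (GROUP_ID + "0001", "R1"),
    (GROUP_ID + "0110", "R2"),
    (GROUP_ID + "1100", "R3"),
]
chosen = anycast_match(GROUP_ID + "0111", TRIGGERS, prefix_len=len(GROUP_ID))
```

With these example values the packet suffix 0111 is closest to R2's suffix 0110, so R2 is chosen; a sender could steer the choice by picking a different suffix.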

[Figure: Receivers R1, R2, and R3 insert triggers (ID|s1, R1), (ID|s2, R2), and (ID|s3, R3). Sender issues send(ID|a, data); in the example, the packet is delivered as send(R1, data).]

Composable Services

Use stack of IDs to encode successive operations to be performed on data (e.g., transcoding)

Don’t need to configure path between services
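The identifier stack can be sketched as a small forwarding loop. This is a minimal sketch with hypothetical names: triggers maps an identifier to the stack that replaces it, services maps a service address to the function it applies, and anything found in neither table is treated as an end-host address. The string replacement merely stands in for real transcoding.

```python
def route(stack, data, triggers, services, max_steps=32):
    """Process an identifier stack front to back.

    Returns (receiver_address, data) once the head of the stack is an
    end-host address, or (None, data) if the stack runs empty.
    """
    stack = list(stack)
    for _ in range(max_steps):
        if not stack:
            return None, data
        head = stack[0]
        if head in triggers:
            # A trigger replaces the matched ID with its own stack.
            stack = list(triggers[head]) + stack[1:]
        elif head in services:
            # A service transforms the data, then forwards the rest of the stack.
            data = services[head](data)
            stack = stack[1:]
        else:
            return head, data
    raise RuntimeError("identifier stack did not resolve")


SERVICES = {"S_MPEG/JPEG": lambda d: d.replace("MPEG", "JPEG")}

# Sender-specified: the sender pushes the transcoder ID in front of the receiver ID.
sender_triggers = {"ID_MPEG/JPEG": ["S_MPEG/JPEG"], "ID": ["R"]}
sender_result = route(["ID_MPEG/JPEG", "ID"], "MPEG frame", sender_triggers, SERVICES)

# Receiver-specified: the receiver's trigger points at the transcoder first.
receiver_triggers = {"ID_MPEG/JPEG": ["S_MPEG/JPEG"], "ID": ["ID_MPEG/JPEG", "R"]}
receiver_result = route(["ID"], "MPEG frame", receiver_triggers, SERVICES)
```

Both variants end at the same receiver with transcoded data; the difference is only who put the transcoder identifier on the stack.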

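[Figure: Sender (MPEG) issues send((ID_MPEG/JPEG, ID), data). The trigger (ID_MPEG/JPEG, S_MPEG/JPEG) delivers the packet to the transcoder S_MPEG/JPEG, which issues send(ID, data); the trigger (ID, R) then delivers send(R, data) to Receiver R (JPEG).]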

Composable Services (cont’d)

Both receivers and senders can specify operations to be performed on data

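[Figure: Receiver R (JPEG) inserts the trigger (ID, (ID_MPEG/JPEG, R)). Sender (MPEG) issues send(ID, data); the network turns this into send((ID_MPEG/JPEG, R), data), which the trigger (ID_MPEG/JPEG, S_MPEG/JPEG) delivers to the transcoder S_MPEG/JPEG; the transcoder then issues send(R, data).]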

Discussion of I3

How would receiver signal ACK to sender? What is needed?

Does many-to-one fit well in this paradigm?

security, snooping, information gathering: what are the issues?

Content Delivery Networks: Indirection with DNS

Internet Content

Content is

static web pages and documents

images and videos, streaming, …

Content becomes more and more important!

500 exabytes (1 exabyte = 10^18 bytes) created in 2008 alone [Jacobson]

Estimated inter-domain traffic rate: 39.8 TB/s [Labovitz]

Annual growth rate of Internet traffic: ~40%-60% [Labovitz]

Much of web growth due to video (Flash, RTSP, RTP, YouTube, etc.)

How to deliver content?

How to cope with growth of content?

Following slides adapted from Wolfgang Mühlbauer

Application mix

Source: Alexandre Gerber and Robert Doverspike. Traffic Types and Growth in Backbone Networks. AT&T Labs – Research 2011.

Application mix in 2009

HTTP dominates

Inside HTTP

Flash-video dominates

Images and RAR files next

Source: Gregor Maier et al. On Dominant Characteristics of Residential Broadband Internet Traffic. IMC 2009.

Prevalence of CDNs

30 (out of ~30000) ASes contribute 30% of inter-domain traffic

July 2009: CDNs originate at least 10% of all inter-domain traffic

Top ten origin ASes in terms of traffic

Source: Craig Labovitz et al. Internet Inter-Domain Traffic. SIGCOMM 2010.

Why Not Serve Content from One’s Own Site?

Enormous demand for popular content: cannot be served from single server

Bad performance: due to large distance (TCP throughput depends on RTT!); bad connectivity?

Single point of “failure”: high demand leads to crashes or high response times (e.g., flash crowds)

High costs: bandwidth and disk space to serve large volumes (e.g., videos)
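The RTT dependence can be made concrete with a standard approximation. The slides give no formula, so this sketch assumes the well-known steady-state model of Mathis et al., throughput ≈ (MSS / RTT) · sqrt(3/2) / sqrt(p), where p is the packet loss rate; the function name and the example numbers are hypothetical.

```python
import math


def tcp_throughput_bps(mss_bytes: float, rtt_s: float, loss_rate: float) -> float:
    """Approximate steady-state TCP throughput in bits per second.

    Assumes the Mathis et al. model: (MSS / RTT) * sqrt(3/2) / sqrt(p).
    """
    return (mss_bytes * 8 / rtt_s) * math.sqrt(1.5) / math.sqrt(loss_rate)


# Same loss rate, same segment size; only the distance (RTT) differs.
near = tcp_throughput_bps(mss_bytes=1460, rtt_s=0.010, loss_rate=1e-4)
far = tcp_throughput_bps(mss_bytes=1460, rtt_s=0.100, loss_rate=1e-4)
```

Under this model throughput is inversely proportional to RTT, so a server ten times farther away in RTT terms sustains roughly one tenth of the rate, all else being equal.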

[Figure: consumer downloads content across AS A, AS B, AS C, AS D, and AS E]

Approaches to Content Delivery

Idea: replicate content and serve it locally

Centralized hosting

Content distribution networks (CDN)

Offload content delivery to large number of content servers

Put content servers near end-users

Peer-to-peer networks

In theory: infinite scalability

Yet: download capacity throttled by uplink capacity of end users

[Figure: content is replicated; download content from closest location]

Akamai – A Large CDN

Akamai (Hawaiian: “intelligent”)

Evolved out of MIT research effort: handle flash crowds

> 70000 servers located in 72 countries, > 1000 ASes

Customers: Yahoo!, Airbus, Audi, BMW, Apple, Microsoft, etc.

Why use Akamai?

Content consumer: Fast download

Content provider: Reduce infrastructure cost, quick and easy deployment of network services

Task of CDNs: Serve content

Static web content: HTML pages, embedded images, binaries …

Dynamic content: break page into fragments; assemble on Akamai server, fetch only noncacheable content from origin website

Applications: audio and video streaming

Akamai: Is the Idea Really That Novel?

Local server cluster

Bad if data center or upstream ISP fails

Mirroring

Deploying clusters in a few locations

Each mirror must be able to carry all the load

Multihoming

Using multiple ISPs to connect to the Internet

Each connection must be able to carry all the load

Akamai vastly increases footprint

monitors and controls their worldwide distributed servers

directs user requests to appropriate servers

handles failures

Akamai Relies on DNS Redirection

Example: Access of Apple webpage (www.apple.com)

Pictures are hosted by Akamai: images.apple.com

Type: dig images.apple.com into your Linux shell

[…]
;; ANSWER SECTION:
images.apple.com. 3016 IN CNAME images.apple.com.edgesuite.net.
[more CNAME redirections]
images.apple.com.edgesuite.net.globalredir.akadns.net. 2961 IN CNAME a199.gi3.akamai.net.
a199.gi3.akamai.net. 10 IN A 184.84.182.56
a199.gi3.akamai.net. 10 IN A 184.84.182.66
[…]

DNS redirects request to DNS servers controlled by Akamai!
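How a resolver follows such a CNAME chain can be sketched over a small in-memory record table. This is a minimal sketch, not a DNS client: the RECORDS table copies only the records visible in the dig output, and the step hidden behind "[more CNAME redirections]" is collapsed into a single direct link, which is a simplification made for illustration and not a claim about the real chain.

```python
RECORDS = {
    "images.apple.com.": ("CNAME", ["images.apple.com.edgesuite.net."]),
    # Simplification: the elided intermediate redirections are skipped here.
    "images.apple.com.edgesuite.net.": (
        "CNAME", ["images.apple.com.edgesuite.net.globalredir.akadns.net."]),
    "images.apple.com.edgesuite.net.globalredir.akadns.net.": (
        "CNAME", ["a199.gi3.akamai.net."]),
    "a199.gi3.akamai.net.": ("A", ["184.84.182.56", "184.84.182.66"]),
}


def resolve(name, records, max_depth=8):
    """Follow CNAME records until an A record set is reached.

    Returns (chain_of_names, list_of_addresses).
    """
    chain = [name]
    for _ in range(max_depth):
        rtype, values = records[name]
        if rtype == "A":
            return chain, values
        name = values[0]  # a CNAME has exactly one target
        chain.append(name)
    raise RuntimeError("CNAME chain too long")


chain, addresses = resolve("images.apple.com.", RECORDS)
```

The client asked for a name under apple.com, yet the final A records are served under akamai.net: each CNAME hands control of the answer to the next zone.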

Akamai Deployment

Edge server organized as “content cluster”

in many Autonomous Systems

multiple servers

local “low-level” DNS server

Client directed to “closest” server

[Figure: four content clusters; the client is directed to the closest one]

How does Akamai Work? (simplified)

[Figure, built up over four slides. Actors: Web Client, Local DNS Server, Root DNS Server, Top-Level Domain DNS Server, Apple Authoritative DNS Server, Apple Web Server, Akamai High-Level DNS Server, Akamai Low-Level DNS Server, Akamai Edge Server]

Normal web request: first DNS resolution (the query www.apple.com ? yields an IP address), then HTTP connection to that IP address

DNS request for "Akamaized" content: the query images.apple.com ? results in CNAME: images.apple.com.edgesuite.net.

The query images.apple.com.edgesuite.net ? results in CNAME: a199.gi3.akamai.net

The query a199.gi3.akamai.net ? returns 2 IP addresses of Akamai edge servers

The web client then fetches the image files from an Akamai edge server

Slide adapted from “Drafting Behind Akamai”, Sigcomm 2006

Two-level server assignment

Akamai top-level DNS server

Anycasted

Selects location of “best” content cluster

Delegates to content cluster’s low-level name server

TTL 1 hour

Akamai low-level server

Returns IP addresses of servers that can satisfy the request: consistent hashing

TTL 20 seconds: quick adaptation to load conditions
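Consistent hashing can be sketched as a ring of hashed server positions. This is a generic textbook version, not Akamai's actual scheme; the class name HashRing, the use of SHA-256, the number of virtual nodes, and the server names are all assumptions made for illustration.

```python
import bisect
import hashlib


def _position(key: str) -> int:
    """Map a string to a point on the ring (first 64 bits of SHA-256)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, servers, replicas=100):
        # Each server gets `replicas` virtual nodes to even out the load.
        self._ring = sorted(
            (_position(f"{server}#{i}"), server)
            for server in servers
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def lookup(self, key: str) -> str:
        """Return the server owning the first ring point after the key's hash."""
        idx = bisect.bisect_right(self._points, _position(key)) % len(self._ring)
        return self._ring[idx][1]


full = HashRing(["server-a", "server-b", "server-c"])   # hypothetical servers
reduced = HashRing(["server-a", "server-b"])            # server-c has failed
objects = [f"/images/{n}.png" for n in range(200)]      # hypothetical URLs
```

The property that matters for a content cluster is that removing one server only remaps the objects that server held; objects cached on the surviving servers stay where they are.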

Most CDNs use similar techniques

Some CDNs rely on Anycast to send traffic to closest content server (e.g., Limelight)

What is the “best” location?

Service requested: server must be able to satisfy the request (e.g., QuickTime stream)

Server health: up and running without errors

Server load: server’s CPU, disk, and network utilization

Network condition: minimal packet loss to client, sufficient bandwidth to handle requests

Client location: server should be close to client, e.g., in terms of RTT
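The five criteria can be combined into a simple filter-then-rank rule. This is a minimal sketch of one plausible combination, not the selection logic of any real CDN; the Cluster fields, the thresholds max_load and max_loss, and the example clusters are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    name: str
    services: frozenset   # services the cluster can satisfy
    healthy: bool         # up and running without errors
    load: float           # utilization between 0 and 1
    loss_rate: float      # packet loss towards the client
    rtt_ms: float         # round-trip time to the client


def pick_cluster(clusters, service, max_load=0.8, max_loss=0.01):
    """Drop clusters failing any hard criterion, then take the lowest RTT."""
    eligible = [
        c for c in clusters
        if service in c.services
        and c.healthy
        and c.load <= max_load
        and c.loss_rate <= max_loss
    ]
    return min(eligible, key=lambda c: c.rtt_ms, default=None)


CLUSTERS = [
    Cluster("near-but-overloaded", frozenset({"http"}), True, 0.95, 0.001, 5.0),
    Cluster("near-but-down", frozenset({"http"}), False, 0.10, 0.001, 6.0),
    Cluster("regional", frozenset({"http", "stream"}), True, 0.40, 0.002, 20.0),
    Cluster("remote", frozenset({"http", "stream"}), True, 0.20, 0.001, 120.0),
]
best = pick_cluster(CLUSTERS, "http")
```

With these example values the two nearest clusters are excluded by load and health, so the regional cluster wins even though it is not the closest in RTT.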

Continuous measurement effort

Number of hops between ASes

Live network statistics (e.g., traceroute)

Load of data centers/content servers

Report load of content servers to local DNS servers

Report content cluster load to the top-level DNS resolver to direct traffic away from overloaded content clusters

Entire system’s health end-to-end

Agents that simulate end-user behavior by downloading web objects

Measure failure rates and download times

Monitor individual customers/services

What is the busiest customer, etc.?

Inverse view: Impact of DNS resolver choice

Importance of DNS for CDN Redirection

CDN only learns IP address of DNS resolver, not of host

[Figure: host, DNS resolver, and three CDN caches. 1. DNS query; 2. CDN selects “closest”/“best” cache]

What Happens if Alternate DNS Resolver is Chosen?

CDN thinks that host is in Google or OpenDNS network

[Figure: as above, but the host uses an alternate resolver, e.g., Google DNS: 8.8.8.8 or OpenDNS: 208.67.222.222. 1. DNS query; 2. CDN selects “closest”/“best” cache]

Data and Approach

Provide custom script to “friends of friends”

Run on 50 commercial ISPs

All around the globe

Query 10k+ top hostnames from Alexa

DNS resolvers: local resolver, Google DNS, OpenDNS

Collect

DNS response times

Returned IP addresses, which provide information about

• subnet (/24)

• Autonomous systems (AS)

• and country from where content is fetched

Source: Bernhard Ager et al. DNS in the Wild. IMC 2010.

DNS response times

3rd-party resolvers sometimes give better performance

Source: Bernhard Ager et al. DNS in the Wild. IMC 2010.

Local DNS vs. Google DNS: Where are we directed?

For 2000 out of 10000 queries: subnets are different

In half of these cases: different AS and country

Source: Bernhard Ager et al. DNS in the Wild. IMC 2010.

Indirection may cause inefficiency

Choice of DNS resolver is crucial

It is a criterion for determining from where content is retrieved

For content locality you have to use the local resolver

Google DNS, OpenDNS etc. may lead to sub-optimal choice of content server

Recent IETF activity

Include IP address of original host in DNS request
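The effect of carrying client information in the DNS request can be sketched with a toy mapping function. This is a minimal sketch: the prefix table, the cache names, and the function select_cache are hypothetical, and 192.0.2.0/24 is a documentation prefix standing in for an ISP's customer range. To my knowledge this idea corresponds to what was later standardized as EDNS Client Subnet, which carries a client prefix instead of the full address.

```python
import ipaddress
from typing import Optional

# Hypothetical CDN mapping table: client-side prefix -> nearby cache.
CACHE_BY_PREFIX = {
    ipaddress.ip_network("192.0.2.0/24"): "cache-near-isp",
    ipaddress.ip_network("8.8.8.0/24"): "cache-near-google",
}
DEFAULT_CACHE = "cache-default"


def select_cache(resolver_ip: str, client_subnet: Optional[str] = None) -> str:
    """Choose a cache from the client subnet if supplied, else from the resolver IP."""
    if client_subnet is not None:
        addr = ipaddress.ip_network(client_subnet, strict=False).network_address
    else:
        addr = ipaddress.ip_address(resolver_ip)
    for prefix, cache in CACHE_BY_PREFIX.items():
        if addr in prefix:
            return cache
    return DEFAULT_CACHE


via_local = select_cache("192.0.2.53")                    # ISP's own resolver
via_google = select_cache("8.8.8.8")                      # third-party resolver
via_google_ecs = select_cache("8.8.8.8", "192.0.2.0/24")  # resolver forwards client prefix
```

Without the client prefix, the third-party resolver's address decides the mapping; with it, the host is mapped as if it had used the local resolver.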

How to mitigate?

So far we have seen

CDNs rely on resolver location only

Extensive measurements to find best client-server mappings

3rd-party resolvers mess up the system

Provider-aided Distance Information System: PaDIS

Knows network topology and conditions

Finds better content servers: good for users

Reduces network load: good for ISPs
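At its core, the idea can be pictured as re-ranking a list of candidate hosts using information only the ISP has. This is a minimal sketch of that picture, not the PaDIS implementation; the function rank_hosts and the path-cost numbers are hypothetical, with the costs chosen so that the result matches the reordering Host2, Host4, Host3, Host1 shown in the slides.

```python
def rank_hosts(candidates, path_cost):
    """Sort candidate hosts by the ISP's cost of reaching them (lower is better).

    Hosts the ISP has no information about are ranked last; ties keep the
    order in which the candidates were given.
    """
    return sorted(candidates, key=lambda host: path_cost.get(host, float("inf")))


# Candidate list as returned by the content provider's DNS.
CANDIDATES = ["Host1", "Host2", "Host3", "Host4"]

# Hypothetical costs derived from the ISP's view of topology and load.
PATH_COST = {"Host1": 9, "Host2": 1, "Host3": 5, "Host4": 3}

ranked = rank_hosts(CANDIDATES, PATH_COST)
```

The set of eligible hosts is unchanged; only the order presented to the client differs, which is how the traffic matrix changes without touching routing.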

Following slides adapted from Georgios Smaragdakis and Anja Feldmann.

PaDIS

[Figure, built up over six slides: Client, External DNS, Provider DNS, and PaDIS inside the Internet Service Provider (ISP), plus a content Host; message steps are numbered 1 to 7]

Full View of the ISP Network

Content can be downloaded from any eligible host!

[Figure detail: the candidate list Host1, Host2, Host3, Host4 appears alongside the reordered list Host2, Host4, Host3, Host1]

Case 1: Network load balancing

[Figure, three builds: Client and the candidate hosts Host A, Host B, and Host C]

Case 2: ISP-CDN collaboration

[Figure, three builds: Client, External DNS, Provider DNS, and PaDIS inside the Internet Service Provider (ISP), plus a content Host; message steps are numbered 1 to 7. The candidate list Host1, Host2, Host3, Host4 appears alongside the orderings Host2, Host4, Host3, Host1 and Host2, Host3, Host1, Host4]

Summary

Alternative traffic engineering

Do not change the routing

Change the traffic matrix!

Benefits

ISPs: Regain control of network traffic

User: Performance improvements

Win-win situation for ISPs and end-users ISPs can share benefits with content and application providers

PaDIS

Simple and easy to implement

Prototype running

Secure Overlay Service

SoS: An overlay network, using indirection and randomization to provide legitimate users (only) with denial-of-service free access to a server.

Overlay network:

Network or distributed infrastructure with common network services (e.g., routing) built “on top” of other networks

Example: Distributed application in which application-layer nodes relay messages among themselves, using underlying IP routing to get from one site to another

Performing a DoS Attack

1. Select Target to attack

2. Break into accounts around the network

3. Have these accounts send packets toward the target

Goal of Secure Overlay Service (SoS)

Pre-approved legitimate users communicate with target

legit users may be mobile (IP addresses change)

Un-approved (attackers’) packets don’t reach target

[Figure: attackers and a legit user, both sending towards the target]

Step 1 – Filtering

Routers “near” target filter packets based on IP addr

IP addresses from legitimate user allowed through

IP addresses from illegitimate users are not

Concerns:

Bad users have same IP address as good user?

Bad users know good user’s IP address: spoofing?

Good IP address changes frequently (mobility)?

Step 2 – indirection via a proxy

[Figure: proxy with address w.x.y.z]

Use proxy, outside filtered region

Proxy, being a computer (rather than a router), can perform heavy-weight authentication and access control

Only packets from proxy permitted through filter

Proxy only forwards verified packets from legitimate sources through filter

Problems with a known Proxy

Proxies introduce other problems

Attacker can breach filter by attacking with spoofed proxy address

Attacker can DoS attack proxy, again preventing legitimate user communication

[Figure: proxy w.x.y.z; several attackers each claim “I’m w.x.y.z”]

Step 3 – Multiple proxies with secret forwarding

Create many proxies (too many to attack)

Target specifies small set of proxies as secret forwarders

Only secret-forwarder packets pass through filter

Only secret forwarders know they are secret forwarders (other proxies unaware)

To get host packet to target

Host contacts any proxy (which checks legitimacy)

Proxy randomly routes packet to another proxy

If destination proxy is secret forwarder, packet forwarded to target, otherwise packet randomly routed to another proxy
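The random forwarding described in these steps can be sketched as a small simulation. This is a minimal sketch that follows the slide's description literally and assumes the entry proxy has already verified the host; the function name, the proxy names, the hop limit, and the seed are hypothetical.

```python
import random


def route_to_target(entry, proxies, secret_forwarders, rng, max_hops=1000):
    """Return the list of proxies a packet visits before reaching the target.

    Each proxy hands the packet to another randomly chosen proxy; the walk
    ends at the first secret forwarder, which alone may pass the filter.
    """
    path = [entry]
    current = entry
    for _ in range(max_hops):
        current = rng.choice([p for p in proxies if p != current])
        path.append(current)
        if current in secret_forwarders:
            return path
    raise RuntimeError("no secret forwarder reached within max_hops")


def filter_accepts(source, secret_forwarders):
    """Routers near the target admit only packets coming from secret forwarders."""
    return source in secret_forwarders


PROXIES = [f"proxy{i}" for i in range(20)]   # hypothetical overlay of 20 proxies
SECRET = {"proxy3", "proxy11"}               # known only to the target and themselves
path = route_to_target("proxy0", PROXIES, SECRET, random.Random(0))
```

An outside observer sees traffic spread over all proxies and cannot tell from the entry point which proxy will finally forward to the target.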

SOS with “Random” routing

With filters, multiple proxies, and secret forwarder(s), attacker cannot “focus” attack

[Figure: overlay of proxies; which proxy is the secret forwarder is unknown (“?”)]

SoS

Why indirection?

Ultimate destination address is unknown (hackers cannot attack the target, only the proxies (?))

Address of target only known to small number of secret forwarders, which rotate and can change

Issues:

Why can’t hacker just try all addresses of all proxies to get through?

Indirection: Summary

We’ve seen indirection used in many ways:

multicast

mobility

Internet indirection

CDNs

SoS

The uses of indirection:

Sender does not need to know receiver id – do not want sender to know intermediary identities

Load balancing

Beauty, grace, elegance

Transparency of indirection is important

Performance: is it more efficient?

Security: Important issue for I3