Distributed Systems Within the Internet
Nov. 9, 2011

Topics
  Domain Name System
    Finding IP address
  Content Delivery Networks
    Caching content within the network

15-440 Distributed Systems


Domain Name System (DNS)

Mapping from Host Names to IP Addresses

Distributed database
  Each site (university, large company, ISP, ...) maintains database with its own entries
  Provide server for others to query

Implemented at Application Layer
  Runs over UDP (normally) or TCP


DNS Name Hierarchy

Both generic (e.g., “.com”) and country (e.g., “.jp”) domains
Top-level names managed by NIC
Other name zones delegated to different entities

[Figure: DNS name tree. Unnamed root at the top. Top-level domain names: mil, edu, gov, com, country codes from ae (United Arab Emirates) through us (United States) to zw (Zimbabwe), and arpa (containing in-addr). Second-level domain names: mit, cmu, and berkeley under edu; amazon under com. Third-level domain names: cs and ece under cmu. Hosts shown: greatwhite (128.2.220.10) under ics, www (128.2.217.13), and www (72.21.194.1) under amazon.]


DNS Name Terminology

Node: Any point in hierarchy
Zone: A complete subtree
Name Servers: Servers that can determine IP addresses within given zone
  With help from other servers

[Figure: the DNS name tree from the DNS Name Hierarchy slide (unnamed root; mil, edu, gov, com, country codes, and arpa; cmu, mit, berkeley, amazon; cs, ece, ics; hosts greatwhite 128.2.220.10, www 128.2.217.13, and www 72.21.194.1).]


Programmer’s View of DNS

Conceptually, programmers can view the DNS database as a collection of millions of host entry structures:

    /* DNS host entry structure */
    struct hostent {
        char  *h_name;       /* official domain name of host */
        char **h_aliases;    /* null-terminated array of domain names */
        int    h_addrtype;   /* host address type (AF_INET) */
        int    h_length;     /* length of an address, in bytes */
        char **h_addr_list;  /* null-terminated array of pointers to in_addr structs */
    };

in_addr is a struct consisting of a 4-byte IP address.

Functions for retrieving host entries from DNS:
  gethostbyname: query key is a DNS domain name.
  gethostbyaddr: query key is an IP address.
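The host entry view can also be tried from Python, whose standard socket module wraps the same resolver interface. In this minimal sketch, socket.gethostbyname_ex returns a (hostname, aliases, addresses) triple that corresponds roughly to h_name, h_aliases, and h_addr_list. The helper name format_host_entry is made up for illustration, and the live lookup of "localhost" is only a placeholder whose result depends on the machine.

```python
import socket


def format_host_entry(entry):
    """Render a (hostname, aliases, addresses) triple, the shape returned by
    socket.gethostbyname_ex, as readable text."""
    hostname, aliases, addresses = entry
    lines = [f"official name: {hostname}"]
    lines += [f"alias: {alias}" for alias in aliases]
    lines += [f"address: {addr}" for addr in addresses]
    return "\n".join(lines)


if __name__ == "__main__":
    try:
        # Live lookup: needs a working resolver, so the output varies by machine.
        print(format_host_entry(socket.gethostbyname_ex("localhost")))
    except OSError as err:
        print(f"lookup failed: {err}")
```

Calling format_host_entry on a hand-built triple such as ("greatwhite.ics.cs.cmu.edu", [], ["128.2.220.10"]) shows the same layout without touching the network.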


Properties of DNS Host Entries

Each host entry is an equivalence class of domain names and IP addresses.

Different kinds of mappings are possible:
  Simple case: 1-1 mapping between domain name and IP addr:
    greatwhite.ics.cs.cmu.edu maps to 128.2.220.10
  Multiple domain names mapped to the same IP address:
    eecs.mit.edu and cs.mit.edu both map to 18.62.1.6
  Multiple domain names mapped to multiple IP addresses:
    aol.com and www.aol.com map to multiple IP addrs.
  Some valid domain names don’t map to any IP address:
    for example: ics.cs.cmu.edu


DNS Name Server Hierarchy

At each level of hierarchy, have group of servers that are authorized to handle that region of hierarchy
At bottom of hierarchy, have authority server for specific name

[Figure: name tree (unnamed root, edu, cmu, cs, then ics with greatwhite 128.2.220.10 and pdl with imperial 128.2.189.40), with the servers for each level:
  unnamed root: a.root-servers.net ... m.root-servers.net
  edu: a.edu-servers.net ...
  cmu: ny-server-03.net.cmu.edu, nsauth1.net.cmu.edu, nsauth2.net.cmu.edu
  cs: AC-DDNS-1.NET.cs.cmu.edu, AC-DDNS-2.NET.cs.cmu.edu, AC-DDNS-3.NET.cs.cmu.edu]


Nominal Root Name Servers

13 total


Physical Root Name Servers

Several root servers have multiple physical servers
Packets routed to “nearest” server by “Anycast” protocol


DNS Records

Database of Resource Records (RRs)
  Classes: IN = Internet
  Each class defines value associated with type

IN Class Types
  A      Address
         Name = hostname, Value = IP address
  NS     Name Server
         Name = domain (e.g., cs.cmu.edu), Value = authoritative name server for this domain
  CNAME  Canonical Name (alias)
         Name = alias name, Value = canonical name
  MX     Mail server
         Value = mail server hostname

Format: (class, name, value, type, TTL)


Getting DNS Information with dig

Performs a DNS lookup, as gethostbyname would
  Lots of command-line options

unix> dig greatwhite.ics.cs.cmu.edu

;; ANSWER SECTION:
greatwhite.ics.cs.cmu.edu. 2966 IN A 128.2.220.10

;; AUTHORITY SECTION:
cs.cmu.edu. 593 IN NS AC-DDNS-3.NET.cs.cmu.edu.
cs.cmu.edu. 593 IN NS AC-DDNS-1.NET.cs.cmu.edu.
cs.cmu.edu. 593 IN NS AC-DDNS-2.NET.cs.cmu.edu.
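Each record line that dig prints carries the fields name, TTL, class, type, and value, in that order. The following is a minimal sketch of splitting such lines into structured records; the names ResourceRecord and parse_record are invented for this example, and it assumes well-formed single-line records like the ones shown in these slides.

```python
from typing import NamedTuple


class ResourceRecord(NamedTuple):
    name: str
    ttl: int       # remaining time to live, in seconds
    rr_class: str  # "IN" for Internet
    rr_type: str   # A, NS, CNAME, MX, ...
    value: str


def parse_record(line):
    """Parse one dig-style record line: name, TTL, class, type, value."""
    name, ttl, rr_class, rr_type, value = line.split(None, 4)
    return ResourceRecord(name, int(ttl), rr_class, rr_type, value)


if __name__ == "__main__":
    for line in [
        "greatwhite.ics.cs.cmu.edu. 2966 IN A 128.2.220.10",
        "cs.cmu.edu. 593 IN NS AC-DDNS-3.NET.cs.cmu.edu.",
    ]:
        print(parse_record(line))
```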


Tracing Hierarchy (1)

Dig Program
  Use flags to find name server (NS)
  Disable recursion so that it operates one step at a time

unix> dig +norecurse @a.root-servers.net NS greatwhite.ics.cs.cmu.edu

;; ADDITIONAL SECTION:
a.edu-servers.net. 172800 IN A 192.5.6.30
c.edu-servers.net. 172800 IN A 192.26.92.30
d.edu-servers.net. 172800 IN A 192.31.80.30
f.edu-servers.net. 172800 IN A 192.35.51.30
g.edu-servers.net. 172800 IN A 192.42.93.30
g.edu-servers.net. 172800 IN AAAA 2001:503:cc2c::2:36
l.edu-servers.net. 172800 IN A 192.41.162.30

All .edu names handled by set of servers
The AAAA record holds an IPv6 address


Tracing Hierarchy (2)

unix> dig +norecurse @g.edu-servers.net NS greatwhite.ics.cs.cmu.edu

;; AUTHORITY SECTION:
cmu.edu. 172800 IN NS ny-server-03.net.cmu.edu.
cmu.edu. 172800 IN NS nsauth1.net.cmu.edu.
cmu.edu. 172800 IN NS nsauth2.net.cmu.edu.

3 servers handle CMU names


Tracing Hierarchy (3 & 4)

unix> dig +norecurse @nsauth1.net.cmu.edu NS greatwhite.ics.cs.cmu.edu

;; AUTHORITY SECTION:
cs.cmu.edu. 600 IN NS AC-DDNS-2.NET.cs.cmu.edu.
cs.cmu.edu. 600 IN NS AC-DDNS-1.NET.cs.cmu.edu.
cs.cmu.edu. 600 IN NS AC-DDNS-3.NET.cs.cmu.edu.

3 servers handle CMU CS names

unix> dig +norecurse @AC-DDNS-2.NET.cs.cmu.edu NS greatwhite.ics.cs.cmu.edu

;; AUTHORITY SECTION:
cs.cmu.edu. 300 IN SOA PLANISPHERE.FAC.cs.cmu.edu.

Server within CS is “start of authority” for this name


Recursive DNS Name Resolution

Nonlocal Lookup
  Recursively from root server downward
  Results passed up

Caching
  Results stored in caches along each hop
  Can short-circuit lookup when cached entry present

[Figure: name tree (unnamed root; edu and com; cmu, cs, ics, and greatwhite 128.2.220.10 under edu; someplace and www 208.216.181.15 under com) alongside the Local Server, Root Server, .edu Server, CMU Server, and CMU CS Server, with messages numbered 1 to 10.]
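The recursive scheme with per-hop caching can be sketched with a toy in-memory model. Everything here is an assumption made for illustration: the NameServer class, the build_hierarchy helper, and the query counters are invented, real servers speak the DNS wire protocol, and TTL expiry of cached entries is left out.

```python
class NameServer:
    """Toy name server that resolves recursively and caches what it learns."""

    def __init__(self, label, hosts=None, delegations=None, upstream=None):
        self.label = label
        self.hosts = hosts or {}              # names this server is authoritative for
        self.delegations = delegations or {}  # zone suffix -> server for that zone
        self.upstream = upstream              # where a local server sends everything
        self.cache = {}                       # answers passed back up through us
        self.queries = 0                      # number of queries received

    def resolve(self, name):
        self.queries += 1
        if name in self.hosts:
            return self.hosts[name]
        if name in self.cache:                # short-circuit on a cached entry
            return self.cache[name]
        next_server = self.upstream
        for suffix, server in self.delegations.items():
            if name.endswith("." + suffix):
                next_server = server
        if next_server is None:
            raise KeyError(name)
        answer = next_server.resolve(name)    # recurse one level down
        self.cache[name] = answer             # store the result on the way back up
        return answer


def build_hierarchy():
    """Wire up the chain local -> root -> .edu -> CMU -> CMU CS."""
    cs = NameServer("CMU CS", hosts={"greatwhite.ics.cs.cmu.edu": "128.2.220.10"})
    cmu = NameServer("CMU", delegations={"cs.cmu.edu": cs})
    edu = NameServer(".edu", delegations={"cmu.edu": cmu})
    root = NameServer("root", delegations={"edu": edu})
    local = NameServer("local", upstream=root)
    return local, root


if __name__ == "__main__":
    local, root = build_hierarchy()
    print(local.resolve("greatwhite.ics.cs.cmu.edu"), root.queries)
    # The second lookup is answered from the local server's cache,
    # so the root's query counter does not change.
    print(local.resolve("greatwhite.ics.cs.cmu.edu"), root.queries)
```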


Iterative DNS Name Resolution

Nonlocal Lookup
  At each step, server returns name of next server down
  Local server directly queries each successive server

Caching
  Local server builds up cache of intermediate translations
  Helps in resolving names xxx.cs.cmu.edu, yy.cmu.edu, and z.edu

[Figure: name tree (unnamed root; edu and com; cmu, cs, ics, and greatwhite 128.2.220.10 under edu; someplace and www 208.216.181.15 under com) alongside the Local Server, Root Server, .edu Server, CMU Server, and CMU CS Server, with messages numbered 1 to 10 exchanged between the local server and each successive server.]
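The iterative scheme can be sketched the same way: the local server follows referrals itself and remembers which server handles each zone. The tables REFERRALS and ADDRESSES and the server labels are hypothetical stand-ins for real name servers, and cache expiry is ignored.

```python
# Hypothetical referral tables: server label -> {zone suffix: server for that zone}.
REFERRALS = {
    "root": {"edu": ".edu server"},
    ".edu server": {"cmu.edu": "CMU server"},
    "CMU server": {"cs.cmu.edu": "CMU CS server"},
}
# Address records held by the authoritative server.
ADDRESSES = {"CMU CS server": {"greatwhite.ics.cs.cmu.edu": "128.2.220.10"}}


def query(server, name):
    """One non-recursive query: ('answer', name, ip) or ('referral', zone, server)."""
    if name in ADDRESSES.get(server, {}):
        return "answer", name, ADDRESSES[server][name]
    for zone, next_server in REFERRALS.get(server, {}).items():
        if name.endswith("." + zone):
            return "referral", zone, next_server
    raise KeyError(name)


def resolve_iteratively(name, zone_cache):
    """Resolve name from the local server, recording each server contacted."""
    # Start at the most specific zone whose server is already cached, else root.
    server, best_zone = "root", ""
    for zone, known_server in zone_cache.items():
        if name.endswith("." + zone) and len(zone) > len(best_zone):
            server, best_zone = known_server, zone
    contacted = []
    while True:
        contacted.append(server)
        kind, key, value = query(server, name)
        if kind == "answer":
            return value, contacted
        zone_cache[key] = value  # remember which server handles this zone
        server = value


if __name__ == "__main__":
    cache = {}
    print(resolve_iteratively("greatwhite.ics.cs.cmu.edu", cache))
    # With the zone cache filled, the second lookup skips root, .edu, and CMU.
    print(resolve_iteratively("greatwhite.ics.cs.cmu.edu", cache))
```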


Reverse DNS

Task
  Given IP address, find its name

Method
  Maintain separate hierarchy based on IP names
  Write 128.2.194.242 as 242.194.128.2.in-addr.arpa

Managing
  Authority manages IP addresses assigned to it
  E.g., CMU manages name space 2.128.in-addr.arpa (addresses 128.2.x.x)

[Figure: two branches below the unnamed root. Forward names: edu, cmu, cs, cmcl, kittyhawk (128.2.194.242). Reverse names: arpa, in-addr, 128, 2, 194, 242.]


.arpa Name Server Hierarchy

At each level of hierarchy, have group of servers that are authorized to handle that region of hierarchy

[Figure: reverse name tree (in-addr.arpa, 128, 2, 194, kittyhawk 128.2.194.242), with the servers listed beside the levels, top to bottom:
  in-addr.arpa: a.root-servers.net ... m.root-servers.net
  128: chia.arin.net (dill, henna, indigo, epazote, figwort, ginseng)
  2: cucumber.srv.cs.cmu.edu, t-ns1.net.cmu.edu, t-ns2.net.cmu.edu
  194: mango.srv.cs.cmu.edu (peach, banana, blueberry)]


Performance Issues

Challenge
  There’s way too much traffic on the Internet
    Popular sites (Google, Amazon, Facebook, …) get huge amounts of traffic
    Could become “hot spot”
  It takes much longer to route packets around world than next door

Opportunities
  Services can be replicated
    Multiple servers / data center
    Multiple data centers around world
  Content can be cached

How Can this Work?
  Contrary to original Internet model: IP address designates unique host


Server Balancing

DNS Tricks
  Customize DNS response to location
    Allows distribution by geography
  Return multiple host names / query
    Client (could) choose one at random
  Update DNS entries with new servers
    Rotate loading

Within Data Center
  Keep changing binding between IP address and host


Server Balancing Example

DNS Tricks
  Different responses to different servers, short TTLs

unix1> dig www.google.com

;; ANSWER SECTION:
www.google.com. 87775 IN CNAME www.l.google.com.
www.l.google.com. 81 IN A 72.14.204.104
www.l.google.com. 81 IN A 72.14.204.105
www.l.google.com. 81 IN A 72.14.204.147
www.l.google.com. 81 IN A 72.14.204.99
www.l.google.com. 81 IN A 72.14.204.103

unix2> dig www.google.com

;; ANSWER SECTION:
www.google.com. 603997 IN CNAME www.l.google.com.
www.l.google.com. 145 IN A 72.14.204.99
www.l.google.com. 145 IN A 72.14.204.103
www.l.google.com. 145 IN A 72.14.204.104
www.l.google.com. 145 IN A 72.14.204.105
www.l.google.com. 145 IN A 72.14.204.147
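The two answers above list the same five addresses in a different order. One simple way a name server could produce that effect is to rotate its address list after every query, so that clients that take the first address are spread across hosts. The sketch below illustrates that general idea only; the class RoundRobinZone is invented here, and nothing in the slides says this is how Google's servers choose their ordering.

```python
from collections import deque


class RoundRobinZone:
    """Hands out the same address set in a rotating order."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        """Return the current address order, then rotate for the next query."""
        current = list(self._addresses)
        self._addresses.rotate(-1)  # first address moves to the end
        return current


if __name__ == "__main__":
    zone = RoundRobinZone(["72.14.204.99", "72.14.204.103", "72.14.204.104"])
    for _ in range(4):
        print(zone.answer())
```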


Typical Workload (Web Pages)

Multiple (typically small) objects per page
  Frame, body, ads, logos, …

File sizes
  Heavy-tailed
    Pareto distribution for tail
    Lognormal for body of distribution

Embedded references
  Number of embedded objects also Pareto

  $\Pr(X > x) = (x / x_m)^{-k}$

This plays havoc with performance. Why?
  Lots of small objects & TCP
    3-way handshake
    Lots of slow starts
    Extra connection state

Solutions?
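To get a feel for the heavy tail, the Pareto tail formula can be evaluated directly. This is a small sketch: the function name pareto_tail and the parameter values in the demonstration are arbitrary choices for illustration, not measurements from the lecture.

```python
def pareto_tail(x, x_m, k):
    """Pr(X > x) for a Pareto distribution with scale x_m and shape k."""
    if x < x_m:
        return 1.0  # every sample is at least x_m
    return (x / x_m) ** -k


if __name__ == "__main__":
    # With shape k = 1, multiplying x by ten only divides the tail by ten,
    # so very large objects remain comparatively likely.
    for size in (1.0, 10.0, 100.0, 1000.0):
        print(size, pareto_tail(size, x_m=1.0, k=1.0))
```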


Content Distribution Networks (CDNs)

The content providers are the CDN customers.

Content replication
  CDN company installs hundreds of CDN servers throughout Internet
    Close to users
  CDN replicates its customers’ content in CDN servers. When provider updates content, CDN updates servers

CDNs:
  Akamai
  Major ISPs

[Figure: origin server in North America, a CDN distribution node, and CDN servers in S. America, Europe, and Asia.]


Serving Through CDN

Requirement
  Route HTTP request to CDN node, rather than to original server

Methods
  CDN provider manipulates DNS tables
  Rewrite HTML pages
    <a href="http://www.nfl.com/images/ben_roethlisberger">
    With
    <a href="http://a989.g.akamai.net/nfl/images/ben_roethlisberger">

unix1> dig www.nfl.com

;; ANSWER SECTION:
www.nfl.com. 300 IN CNAME www.nfl.com.edgesuite.net.
www.nfl.com.edgesuite.net. 13778 IN CNAME a989.g.akamai.net.
a989.g.akamai.net. 20 IN A 96.7.40.32
a989.g.akamai.net. 20 IN A 96.7.40.33
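The HTML rewriting method amounts to a URL transformation: swap the origin host for a CDN host and prefix the path with a customer label. The sketch below reproduces the nfl.com example from this slide. The function rewrite_for_cdn and its parameters are hypothetical, and a real CDN's rewriting rules are likely more involved.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_for_cdn(url, origin_host, cdn_host, customer_prefix):
    """Point a URL at the CDN host if it refers to the origin host."""
    parts = urlsplit(url)
    if parts.netloc != origin_host:
        return url  # leave links to other sites alone
    new_path = "/" + customer_prefix + parts.path
    return urlunsplit((parts.scheme, cdn_host, new_path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(rewrite_for_cdn(
        "http://www.nfl.com/images/ben_roethlisberger",
        origin_host="www.nfl.com",
        cdn_host="a989.g.akamai.net",
        customer_prefix="nfl",
    ))
```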


Caching Content in CDN

Simplistic
  Each CDN server caches content that flows through it

Better
  Create DHT among cluster of servers
  Consistent hashing (the basis of Chord) led to founding of Akamai

Challenges
  Usual ones of staleness / consistency / replication
  Handled by TTLs

Effectiveness
  Can’t cache dynamic content
    Responses to individual queries
    But, even dynamic pages contain static links
  Great for streaming content
    If multiple clients viewing same programs ~ simultaneously
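A DHT among a cluster of cache servers is typically built on consistent hashing, the idea underlying Chord: servers and content keys are hashed onto the same ring, and each key belongs to the first server clockwise from it. The sketch below illustrates that idea under simplifying assumptions (no virtual nodes, no replication); the names HashRing and ring_position and the server labels are invented, and this is not a description of Akamai's actual system.

```python
import bisect
import hashlib


def ring_position(key):
    """Map a string to a point on the hash ring (a large integer)."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


class HashRing:
    """Assigns each URL to the first server clockwise from its ring position."""

    def __init__(self, servers):
        self._points = sorted((ring_position(s), s) for s in servers)
        self._positions = [position for position, _ in self._points]

    def server_for(self, url):
        index = bisect.bisect_left(self._positions, ring_position(url))
        return self._points[index % len(self._points)][1]  # wrap around the ring


if __name__ == "__main__":
    ring = HashRing(["cache-a", "cache-b", "cache-c", "cache-d"])
    for url in ("/nfl/images/1", "/nfl/images/2", "/nfl/images/3"):
        print(url, "->", ring.server_for(url))
```

The property that makes this attractive for caching is that removing one server only reassigns the URLs that server held; every other URL keeps its cache location.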


Summary

DNS is one of the world’s largest distributed systems
  Operation and authority delegated hierarchically
  Huge number of queries / second

Many Ways to Reduce / Balance Traffic
  Contrary to simple unique address / host model
  Time & location varying DNS entries
  CDNs