Content Delivery Networks

CONTENT DELIVERY NETWORKS 2015 UNIVERSITY NAME HERE Student name here

description

In this week, write a paper investigating and evaluating the functionality of Content Delivery Networks (CDNs). Find and use a minimum of five technical resources from either NCU’s library or credible web sources. In your paper, be sure to: analyze the general operations of a CDN using an appropriate diagram and professional terminology; analyze how CDN applies to Google and Project Gutenberg; evaluate alternative methodologies for selecting the appropriate node for content delivery; and construct brief rules that you would use to select the appropriate node for a movie database.

Transcript of Content Delivery Networks

Table of Contents
What is a content delivery network?
How does a CDN operate?
Architecture of CDNs
1. Basic Fabric Layer
2. Communication and Connectivity Layer
3. CDN Layer
4. End-User Layer
How Google tapped the powers of CDN?
Project Gutenberg

What is a content delivery network?

A content delivery network, or content distribution network (CDN), provides alternative server nodes from which users can download the resources they need. It is a large distributed system of servers physically located in multiple data centers across the Internet. The main aim of a CDN is to serve content to end users with high availability and high download performance. Today, CDNs support the fast availability of content across the world. The downloadable content includes web objects such as text, graphics, images, scripts, media files, personal and official documents, software, e-commerce and portal applications, live streaming media, on-demand streaming media, and social network files.

How does a CDN operate?

Most CDNs operate as an application service provider (ASP) on the Internet, a model that resembles Software as a Service. Network owners build their own CDNs to accelerate on-net content delivery, reduce dependence on the telecommunications infrastructure, and generate revenue from customers interested in varied, data-heavy content. Generally, a CDN operator is paid by content providers, such as e-commerce companies, for delivering their content to end users. In turn, the CDN pays ISPs, network carriers, and operators for hosting its servers in their data centers. Content providers save substantially on costs because a CDN offloads traffic from the provider's own infrastructure, gives users high-speed access to the data, and offers a good degree of protection from DoS attacks and other security risks.

Recently, many CDNs have adopted a hybrid model built on P2P technology, in which content is served by both dedicated CDN servers and peer computers owned by users. The following figure illustrates how a CDN enables its users to download data from the closest geographical location.

Figure 1: CDN servers at different geographical locations

Each node in the CDN (also called an edge server) caches the static content of the sites passing through it, such as text, images, CSS/JS files, and structural components. Because most of an end user's bandwidth and time is spent downloading this content, a CDN speeds up delivery by storing these building blocks and serving them to the user directly. The edge server at the closest geographical location serves the stored content much faster than the origin server, which reduces latency and results in a faster browsing experience.

Architecture of CDNs

Content delivery networks follow a layered architecture, as follows:

Figure 2: Layered architecture of a CDN

1. Basic Fabric Layer: This is the lowest layer, which provides the infrastructural resources for forming a CDN. It contains file servers, clusters, SMP systems, index servers, and other basic network nodes connected by a high-bandwidth network. Each of these nodes runs a copy of the operating system, a distributed file management system, and content indexing and management systems.
2. Communication and Connectivity Layer: Core communication protocols such as TCP/IP and FTP form this layer, along with CDN-specific protocols such as the Internet Cache Protocol (ICP), the Hypertext Caching Protocol (HTCP), and the Cache Array Routing Protocol (CARP). Authentication is also handled at this layer with the help of SSL and PKI. Application-specific interfaces at this layer provide efficient search and retrieval of replicated content based on distributed indexes.
3. CDN Layer: As the name suggests, this layer provides the core functionality of the CDN. It consists of the CDN services layer, the CDN types layer, and the CDN content types layer. Request routing, caching, geographical server load balancing, user SLA management, resource sharing, and CDN brokering are handled here.
4. End-User Layer: This is the topmost layer, consisting of the web users who connect to the CDN by specifying the URL of the content provider's web site in their web browsers.

How Google tapped the powers of CDN?

In August 2011, Google announced its official implementation of the edns-client-subnet IETF Internet Draft, which is intended to localize DNS responses. Leading DNS and CDN service providers came together to use the IP address of the end user when resolving DNS requests. Google also made its Google Public DNS available from multiple servers worldwide, using content delivery network technology to speed up DNS lookups.
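The idea of localizing a DNS response by the client's address can be sketched roughly as follows. This is only an illustration of the principle, not Google's implementation: the prefix table, the host names, and the function `pick_edge_for_client` are all hypothetical, and a real resolver would carry the client subnet inside the DNS query rather than as a function argument.

```python
import ipaddress

# Hypothetical mapping from client prefixes to edge servers (illustrative only).
EDGE_BY_PREFIX = {
    ipaddress.ip_network("192.0.2.0/24"): "edge-eu.example.net",
    ipaddress.ip_network("198.51.100.0/24"): "edge-us.example.net",
    ipaddress.ip_network("198.51.100.128/25"): "edge-us-west.example.net",
}
# Assumed fallback when no prefix matches.
DEFAULT_EDGE = "origin.example.net"


def pick_edge_for_client(client_ip: str) -> str:
    """Return the edge mapped to the longest prefix containing client_ip."""
    addr = ipaddress.ip_address(client_ip)
    matches = [net for net in EDGE_BY_PREFIX if addr in net]
    if not matches:
        return DEFAULT_EDGE
    # The most specific (longest) matching prefix wins.
    best = max(matches, key=lambda net: net.prefixlen)
    return EDGE_BY_PREFIX[best]


if __name__ == "__main__":
    for ip in ("192.0.2.10", "198.51.100.200", "203.0.113.1"):
        print(ip, "->", pick_edge_for_client(ip))
```

Using the longest matching prefix lets a coarse regional mapping coexist with more specific overrides for individual subnets.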
Google's PageSpeed project also tapped CDN functionality to launch an online service for speeding up the delivery of web pages. Site owners sign up and point their website's DNS entry to Google's PageSpeed service. The service fetches the content from the site's own servers, rewrites the pages using performance best practices, and serves them to end users from Google servers across the globe. The website is delivered as before, but with reported speed improvements of 25% to 60%. To provide such a service, Google relies on CDN technology behind the scenes. Google Drive itself can also be used by end users to create their own personalized content delivery system. Users can store varied content types in Google Drive and get faster access to them, because Google distributes the physical location of this data in an optimized manner using CDN models.

Project Gutenberg

This project is a volunteer effort to digitize and archive cultural works and to encourage the creation and distribution of e-books across the world. It is a large digital library, founded in 1971 by Michael S. Hart. It consists of thousands of e-books in varied and inconsistent content formats. These e-books are hosted on multiple servers at diverse geographical locations and replicated to many others. Books downloaded from the project are, for the most part, the user's to keep. Project Gutenberg has partnered with other initiatives, such as OverDrive, to give its users access to public domain and royalty-free e-books with the best performance for their location.

Request Routing in CDNs and alternatives

In a CDN, a request routing system routes end users' content requests to appropriate edge servers for delivery. Basically, the request needs to be forwarded to the edge server located closest to the end user's machine.
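Choosing the geographically closest edge server can be sketched as a great-circle distance comparison. The edge names and coordinates below are made up for illustration, and the haversine formula is used only as one simple way to estimate physical distance.

```python
import math

# Hypothetical edge locations as (latitude, longitude) in degrees.
EDGES = {
    "frankfurt": (50.11, 8.68),
    "singapore": (1.35, 103.82),
    "virginia": (38.95, -77.45),
}

EARTH_RADIUS_KM = 6371.0


def great_circle_km(a, b):
    """Haversine distance in kilometres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_edge(user_location):
    """Return the name of the edge with the smallest distance to the user."""
    return min(EDGES, key=lambda name: great_circle_km(user_location, EDGES[name]))


if __name__ == "__main__":
    print(nearest_edge((48.85, 2.35)))    # a user near Paris
    print(nearest_edge((13.75, 100.50)))  # a user near Bangkok
```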
However, the closest server may not always be the best choice for the user. Other metrics therefore come into the picture, such as network proximity, client-perceived latency, physical distance, and server load. The chosen content selection and delivery techniques have a direct impact on the design of the request routing system used by a CDN. The main alternatives for request routing are as follows:

1. DNS-Based Routing: The CDN service provider hosts a local DNS server that can return the address of the edge server nearest to the end user for a specific domain. If the local DNS cache does not hold the entry, the request is forwarded to the DNS root server, which can return the address of the authoritative DNS server. The authoritative DNS server can return the addresses of multiple edge servers, which can be tried one after the other.
2. Transport-Layer-Based Routing: A more appropriate edge server can be chosen by using the transport-layer details available in the first packet from the requesting client, such as its IP address and port number.
3. Application-Layer-Based Routing: In this approach, finer-grained routing is achieved by making the routing decision for each individual content item. The methods for doing this are header inspection and content modification. Reference [6] details this technique.
4. Content-Layer-Based Routing: Special content routers are used to support naming. They are useful for both IP-based and name-based routing, and most are installed alongside firewalls, gateways, and Border Gateway Protocol (BGP) routers. Reference [7] details this technique.

Selection of appropriate movie node

The most suitable request routing method for a movie node is content-layer-based routing. A movie is composed of different chunks of data, so it is better to divide a movie request into multiple edge server requests based on the type of content in each chunk.
For example, the music in a video can be fetched from one server node, the cast listing from another, and so on. Different scenes of a movie can be streamed at different rates and can therefore be retrieved from servers at different locations. Movies are also often delivered successfully using popularity-based routing, where some chunks of the movie have a higher popularity rating than others and are requested more frequently by users. Variation in the popularity distribution can lead the CDN servers to place the right content at the right locations dynamically.
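One possible way to turn these ideas into brief node selection rules is sketched below. The rules, the `EdgeNode` fields, the `MAX_LOAD` cutoff, and the load-weighted latency score are all assumptions made for illustration; they are not taken from the cited references.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeNode:
    name: str
    latency_ms: float  # estimated latency from the client to this node
    load: float        # 0.0 (idle) to 1.0 (saturated)
    cached_chunks: set = field(default_factory=set)


MAX_LOAD = 0.9  # assumed cutoff above which a node is skipped


def select_node(nodes, chunk_id):
    """Pick a node for one movie chunk using three simple rules."""
    # Rule 1: ignore nodes that are close to saturation.
    candidates = [n for n in nodes if n.load < MAX_LOAD]
    if not candidates:
        return None
    # Rule 2: prefer nodes that already hold the requested chunk.
    holders = [n for n in candidates if chunk_id in n.cached_chunks]
    pool = holders or candidates
    # Rule 3: among those, take the lowest latency, penalised by load.
    return min(pool, key=lambda n: n.latency_ms * (1 + n.load))


if __name__ == "__main__":
    nodes = [
        EdgeNode("a", 20, 0.95, {"scene-1"}),
        EdgeNode("b", 40, 0.20, {"scene-1"}),
        EdgeNode("c", 10, 0.10),
    ]
    print(select_node(nodes, "scene-1").name)  # node holding the chunk
    print(select_node(nodes, "scene-9").name)  # no holder, falls back to rule 3
```

Popular chunks would end up cached on more nodes, so rule 2 tends to serve them locally, while rarely requested chunks fall through to the latency and load comparison.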

References

1. Li, Jin. "On peer-to-peer (P2P) content delivery". Peer-to-Peer Networking and Applications 1 (1): 45-63. doi:10.1007/s12083-007-0003-1.
2. Hofmann, Markus; Leland R. Beaumont (2005). Content Networking: Architecture, Protocols, and Practice. Morgan Kaufmann Publisher. ISBN 1-55860-834-6.
3. Barbir, A., Cain, B., Nair, R., Spatscheck, O. "Known Content Network (CN) Request-Routing Mechanisms". RFC 3568, July 2003.
4. Google gets into the Content Delivery Network business | ZDNet. [ONLINE] Available at: http://www.zdnet.com/article/google-gets-into-the-content-delivery-network-business/. [Accessed 15 April 2015].
5. Google Drive As A Content Delivery Network (CDN) For Your Website. [ONLINE] Available at: http://www.komku.org/2013/12/google-drive-as-a-free-cdn.html. [Accessed 15 April 2015].
6. B. Cain, F. Douglis, M. Green, M. Hofmann, R. Nair, D. Potter, and O. Spatscheck, "Known CDN Request-Routing Mechanisms", http://www.contentalliance.org/docs/draft-caincdnp-known-req-route-00.html (work in progress), November 2000.
7. Mark Gritter, David R. Cheriton, "An Architecture for Content Routing Support in the Internet", http://www.dsg.stanford.edu/papers/contentrouting/, 2001.
8. B. Cain, F. Douglis, M. Green, M. Hofmann, R. Nair, D. Potter, and O. Spatscheck, "Known CDN Request-Routing Mechanisms", http://www.contentalliance.org/docs/draft-caincdnp-known-req-route-00.html (work in progress), November 2000.