DistOS-2011W Akamai and CDN

Fahim Rahman

Introduction

Content Distribution Networks (CDNs) have become an increasingly vital class of distributed system as demand for web applications and media streaming continues to grow. A CDN is a distributed system of computers that holds copies of data at various points within a network. This increases the aggregate bandwidth available to users, who can fetch data from different points within the network. Latency is also reduced for applications and data transfers, because these servers tend to be located closer to any given group of users. At its simplest, a CDN is a mirroring mechanism that addresses part of the last-mile problem to ensure a positive user experience when streaming media or using a web-based application.

On the internet, the reach of content and its associated applications is dictated by the cost shouldered by the publisher. An individual publisher, or service provider, can reach a large audience only with the funding to run a combination of load-balanced servers and fast network connections. This can be a significant barrier to entry for a smaller, unfunded service provider. Popularity on the internet tends to arrive in waves. A small website can experience the “Slashdot” effect, in which it receives a surge of traffic all at once as a wave of popularity arrives. Because many small content providers are not prepared for this kind of popularity, the result is unintended downtime: the traffic levels are simply unsustainable for the smaller provider. A content distribution network is a plausible solution to this problem.

Mirroring has presented itself as a natural way to serve static content. It requires a voluntary effort, with people using their own servers and networks to lend a hand to a provider whose content they value and want to support. Peer-to-peer networks and file sharing also display the effort individual users are willing to put forth to distribute valued content. The sustainability of this effort is questionable, given that it relies on the value proposition being high enough among a user base with resources to spare. This paper explores two specific content distribution mechanisms in detail: Coral-CDN, a publicly available resource, and Akamai Technologies, a commercial solution. Many other CDNs exist, but only these two are examined here. The paper covers their general approach (section 2), their technical approach (section 3), and the user experience (section 4). A discussion follows, highlighting issues that each solution faces and needs to consider for the future (section 5), along with a conclusion on the CDN space as it relates to the use of the internet (section 6).

CDN Origins

The foundations of CDNs lie in the mirroring of websites. A mirror site is a separate site set up as an identical copy of another site, commonly used to widen access to the same information. This is the most straightforward way to address latency: a website based in Canada could be mirrored in Australia to give Australian users quicker access to it. The approach raises several issues, however, including synchronization, load requirements and BGP concerns.

Local clustering provides another solution that offers improved scalability. Its weakness is that a single point of failure remains: if the ISP fails, the entire site suffers. Significant planning is also required of the site or content provider, as the resident servers at any location have to be able to handle peak loads.

Evaluated Systems/Programs

CoralCDN

CoralCDN offers a structure that democratizes content publication. It is structured in part as a peer-to-peer network, using voluntarily contributed aggregate bandwidth to minimize the effect of a surge of traffic to a particular website. Any user can invoke the service by appending a simple string to the hostname in a URL.

Coral-flow.gif

The process is as follows:

1. The client requests a URL whose hostname has “.nyud.net:8090” appended, for example “http://www.x.com.nyud.net:8090/”, and sends a DNS request for www.x.com.nyud.net to its local resolver.

2. The client’s resolver attempts to resolve the hostname using one or more Coral DNS servers, likely starting with one registered under the .net domain.

3. On receiving the query, a Coral DNS server probes the client to determine the round-trip time.

4. Based on the probe results, the DNS server checks Coral for any known nameservers and/or HTTP proxies close to the client.

5. If servers are found via Coral, the DNS server returns them. If none are found, it returns a random set of nameservers and proxies. In either case, if the DNS server is close to the client, it returns only nodes that are close to itself.

6. The client’s resolver returns the address of a Coral HTTP proxy for www.x.com.nyud.net.

7. The client sends its HTTP request to the specified proxy. If the proxy has the file cached locally, it returns the file and the process stops; otherwise the process continues.

8. The proxy looks up the object’s URL in Coral.

9. If Coral returns the address of a node that has the object cached, the proxy fetches the object from that node. Otherwise, the proxy downloads the object from the origin server.

10. The proxy stores the web object and returns it to the client browser.

11. The proxy stores a reference to itself in Coral, recording that it is now caching the URL.
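The proxy-side part of this process (steps 7 to 11) can be sketched in a few lines of Python. This is only an illustrative in-memory model, not CoralCDN's implementation: the names `CoralIndex`, `Proxy`, `handle` and `fetch_origin` are hypothetical, the index is a plain dictionary standing in for Coral's distributed index, and DNS redirection (steps 1 to 6) is left out entirely.

```python
from typing import Callable, Dict, List, Optional


class CoralIndex:
    """Hypothetical stand-in for Coral: maps a URL to proxies caching it."""

    def __init__(self) -> None:
        self._holders: Dict[str, List["Proxy"]] = {}

    def lookup(self, url: str) -> Optional["Proxy"]:
        # Return one proxy known to cache the URL, or None if there is none.
        holders = self._holders.get(url, [])
        return holders[0] if holders else None

    def insert(self, url: str, proxy: "Proxy") -> None:
        # Record that this proxy is now caching the URL.
        holders = self._holders.setdefault(url, [])
        if proxy not in holders:
            holders.append(proxy)


class Proxy:
    """Hypothetical model of a Coral HTTP proxy with a local cache."""

    def __init__(self, name: str, index: CoralIndex,
                 fetch_origin: Callable[[str], bytes]) -> None:
        self.name = name
        self.index = index
        self.fetch_origin = fetch_origin  # assumed callback to the origin
        self.cache: Dict[str, bytes] = {}

    def handle(self, url: str) -> bytes:
        # Step 7: serve from the local cache if possible and stop.
        if url in self.cache:
            return self.cache[url]
        # Step 8: look the URL up in the shared index.
        peer = self.index.lookup(url)
        # Step 9: fetch from a caching peer, else from the origin server.
        if peer is not None and url in peer.cache:
            body = peer.cache[url]
        else:
            body = self.fetch_origin(url)
        # Step 10: store the object locally before returning it.
        self.cache[url] = body
        # Step 11: advertise this proxy as a holder of the URL.
        self.index.insert(url, self)
        return body
```

In this model, if two proxies share one index and both are asked for the same URL, only the first request reaches the origin; the second proxy obtains the object from the first, which is the load-shedding behaviour the steps describe.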

Akamai

Akamai began as an academic exercise at MIT in the late 1990s. Within a decade it has grown to serve many high-profile content providers. Broadcast networks and application service providers have benefited handsomely from the distributed design on which Akamai is built. Put simply, Akamai installs thousands of servers across thousands of networks to deliver the services that its clients demand.

Akamai-flow.gif

Experiences/Comparison

CoralCDN

Consider an internet user (browser) seeking content from a specific provider. Either the provider or the user can call on the Coral system (resolver) to retrieve the content in question, simply by appending “.nyud.net:8090” to the hostname in the URL. In this example, a specific image file is requested from a website through the CoralCDN system.
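The URL rewrite described above is mechanical, so it can be shown as a small helper. The following is a minimal sketch using Python's standard `urllib.parse`; the function name `coralize` and the two constants are illustrative choices, and the sketch assumes the original URL carries no explicit port or credentials (any that are present are dropped).

```python
from urllib.parse import urlsplit, urlunsplit

CORAL_SUFFIX = ".nyud.net"  # suffix named in the text
CORAL_PORT = 8090           # port named in the text


def coralize(url: str) -> str:
    """Rewrite a URL so that it is fetched through CoralCDN."""
    parts = urlsplit(url)
    host = parts.hostname
    if host is None:
        raise ValueError(f"URL has no hostname: {url!r}")
    if host.endswith(CORAL_SUFFIX):
        return url  # already rewritten; leave unchanged
    # Append the Coral suffix and port to the hostname only;
    # the path, query and fragment are preserved as they were.
    netloc = f"{host}{CORAL_SUFFIX}:{CORAL_PORT}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )
```

For instance, `coralize("http://www.x.com/")` yields `http://www.x.com.nyud.net:8090/`.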

The example site, http://www.livethesheendream.com, was chosen because it was experiencing the “Slashdot” effect at the time of study (http://www.torontosun.com/entertainment/celebrities/2011/03/02/17469286.html).

Packet captures showed that when the base image http://livethesheendream.com/images/sheen.jpg was accessed, the server IP was 64.207.144.170, which resolved to a server in Culver City, California, United States. When the appended URL http://livethesheendream.com.nyud.net:8090/images/sheen.jpg was accessed, the packet trace showed that the content came from 130.127.39.152, which resolved to Anderson City, South Carolina, United States. This server is geographically closer to the user in Ottawa, which suggests that Coral's node selection worked as intended in this case.
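The "geographically closer" claim can be checked with a great-circle distance calculation. The sketch below uses the standard haversine formula; the latitude and longitude values are rough city-centre coordinates assumed for illustration, not values taken from the packet captures, and geographic distance is only a loose proxy for network latency.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float,
                 lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate coordinates (assumed for illustration only).
OTTAWA = (45.42, -75.70)
CULVER_CITY = (34.02, -118.40)
ANDERSON_SC = (34.50, -82.65)

to_origin = haversine_km(*OTTAWA, *CULVER_CITY)
to_coral = haversine_km(*OTTAWA, *ANDERSON_SC)
print(f"Ottawa to origin server: {to_origin:.0f} km")
print(f"Ottawa to Coral proxy:   {to_coral:.0f} km")
```

With these approximate coordinates, the distance to the Coral proxy comes out at well under half the distance to the origin server.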

Discussion

What was interesting? What was surprising? Here you can go out on tangents relating to your work

Conclusion

Summarize the report, point to future work.

References

Globally distributed content delivery (Akamai) - Accessed Feb. 10, 2011

Coral-CDN paper - Accessed Feb. 18, 2011

The Slashdot Effect (Wikipedia) - Accessed Feb. 24, 2011

Akamai - Why the Edge? - Accessed Feb. 25, 2011

How to build your own CDN... - Accessed Feb. 24, 2011

The Design of CoralCDN - Accessed Feb. 27, 2011

Akamai - State of the Internet - Accessed Jan. 25, 2011

Akamai - Online Video Publishers - Accessed Feb. 28, 2011