DistOS-2011W Akamai and CDN

Revision as of 01:35, 7 March 2011

Fahim Rahman

Introduction

Content Distribution Networks (CDNs) have become a vital class of distributed system as demand for web applications and media streaming continues to grow. A CDN is a distributed system of computers that holds copies of data at various points within a network. This increases the bandwidth available to users, who can fetch data from different points within the network. It also reduces the latency seen by an application or data transfer, because the servers tend to be closer to any given group of users. At its simplest, a CDN is a mirroring mechanism that solves part of the last-mile problem to ensure a positive user experience when streaming media or using a web-based application.

The content and applications available on the internet are dictated by the cost shouldered by the publisher. A publisher, or service provider, can reach a large audience only if it has the funding to run a combination of load-balanced servers and fast network connections. This can be a significant barrier to entry for a smaller, unfunded service provider. Popularity on the internet tends to arrive in waves. A small website can experience the “Slashdot effect”, in which it receives a surge of traffic all at once as a wave of popularity arrives. Because many small content providers are not prepared for this kind of popularity, the result is unintended downtime: the traffic levels are simply unsustainable for the smaller provider. A content distribution network is a plausible solution to this problem.

Mirroring has presented itself as a natural solution for serving static content. It relies on volunteers using their own servers and networks to help a provider whose content they value and want to support. Peer-to-peer networks and file sharing also show the effort individual users are willing to put forth to distribute valued content. The sustainability of this effort is questionable, since it depends on the value proposition being high enough among a user base with resources to spare. This paper explores two specific content distribution mechanisms in detail: Coral-CDN, a publicly available resource, and Akamai Technologies, a commercial solution. Many other CDNs exist, but the paper focuses on these two. It covers their general approach (section 2), their technical approach (section 3), and the user experience (section 4). A discussion follows, highlighting issues that each solution faces and needs to consider for the future (section 5), along with a conclusion on the CDN space as it relates to the use of the internet (section 6).

CDN Origins

The foundations of CDNs lie in the mirroring of websites. A mirror site is a separate site set up as an identical copy of another site, commonly used to widen access to the same information. In its simplest form this addresses latency: a website based in Canada could be mirrored in Australia to give Australian users quicker access to it. The approach raises many issues, including synchronization, load requirements and BGP concerns.

Local clustering provides another solution that offers improved scalability. Its weakness is that there is still a single point of failure: if the ISP fails, the entire site suffers. It also requires significant planning by the site or content provider, as the servers at any location have to be able to handle peak loads.
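
As a rough illustration of why a nearby mirror helps, the sketch below picks the mirror with the lowest measured round-trip time. The function name `pick_mirror`, the mirror labels, and the latency figures are all hypothetical and chosen only for the example; a real deployment would measure latency itself and would also have to deal with the synchronization and routing issues noted above.

```python
def pick_mirror(rtt_ms):
    """Return the name of the mirror with the lowest round-trip time.

    rtt_ms maps a mirror name to a measured round-trip time in milliseconds.
    """
    if not rtt_ms:
        raise ValueError("no mirrors available")
    # min() over the dict iterates its keys; key=rtt_ms.get ranks them by RTT.
    return min(rtt_ms, key=rtt_ms.get)


# Hypothetical measurements as seen by a user in Australia.
measured = {"canada-origin": 210.0, "australia-mirror": 18.0}
best = pick_mirror(measured)
```

With these made-up numbers the Australian mirror is selected, which is the behaviour the mirroring example describes.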

Evaluated Systems/Programs

Coral-flow.gif

The process is as follows:

1. The hostname in the URL is rewritten by appending “.nyud.net:8090”, for example “http://www.x.com.nyud.net:8090/”, and the client sends a DNS request for www.x.com.nyud.net to its local resolver.

2. The client’s resolver attempts to resolve the hostname using Coral DNS servers, possibly starting with one of those registered under the .net domain.

3. On receiving the query, a Coral DNS server probes the client to determine the round-trip time.

4. Based on the probe results, the DNS server checks Coral for any known nameservers and/or HTTP proxies close to the client.

5. If servers are found via Coral, the DNS server returns them. If none are found, it returns a random set of nameservers and proxies. In either case, if the DNS server is itself close to the client, it returns only nodes that are close to itself.

6. The client’s resolver returns the address of a Coral HTTP proxy for www.x.com.nyud.net.

7. The client sends its HTTP request to the specified proxy. If the proxy has the file cached locally, it returns the file and the process stops. Otherwise, the process continues.

8. The proxy looks up the object’s URL in Coral.

9. If Coral returns the address of a node that has the object cached, the proxy fetches the object from that node. Otherwise, the proxy downloads the object from the origin server.

10. The proxy stores the web object and returns it to the client browser.

11. The proxy stores a reference to itself in Coral, recording that it is now caching the URL.
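
The URL rewriting in step 1 and the proxy-side logic in steps 7 to 11 can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not Coral's actual implementation: the names `coralize`, `CoralIndex`, and `Proxy` are hypothetical, the Coral index is replaced by a plain dictionary, origin servers are replaced by a dictionary of URL to bytes, and the DNS steps (2 to 6) are not modelled at all.

```python
from urllib.parse import urlsplit, urlunsplit

CORAL_SUFFIX = ".nyud.net"
CORAL_PORT = 8090


def coralize(url):
    """Step 1: append .nyud.net:8090 to the hostname of a URL.

    Any port in the original URL is dropped in this simplified sketch.
    """
    parts = urlsplit(url)
    netloc = f"{parts.hostname}{CORAL_SUFFIX}:{CORAL_PORT}"
    return urlunsplit((parts.scheme, netloc, parts.path or "/",
                       parts.query, parts.fragment))


class CoralIndex:
    """Toy stand-in for Coral: maps a URL to the proxies caching it."""

    def __init__(self):
        self.holders = {}

    def lookup(self, url):
        return list(self.holders.get(url, []))

    def insert(self, url, proxy):
        self.holders.setdefault(url, []).append(proxy)


class Proxy:
    """Toy HTTP proxy following steps 7 to 11 of the list above."""

    def __init__(self, name, index, origin):
        self.name = name
        self.index = index
        self.origin = origin      # dict of URL -> bytes, stands in for origin servers
        self.cache = {}
        self.origin_fetches = 0   # counts downloads from the origin

    def handle(self, url):
        # Step 7: a locally cached copy is returned immediately.
        if url in self.cache:
            return self.cache[url]
        # Step 8: look the URL up in the index.
        peers = [p for p in self.index.lookup(url) if p is not self]
        if peers:
            # Step 9a: fetch from a node that already caches the object.
            body = peers[0].cache[url]
        else:
            # Step 9b: fall back to the origin server.
            body = self.origin[url]
            self.origin_fetches += 1
        # Step 10: store the object before returning it to the client.
        self.cache[url] = body
        # Step 11: record in the index that this proxy now caches the URL.
        self.index.insert(url, self)
        return body


index = CoralIndex()
origin = {"http://www.x.com/": b"hello"}
proxy_a = Proxy("a", index, origin)
proxy_b = Proxy("b", index, origin)
proxy_a.handle("http://www.x.com/")   # first request goes to the origin
proxy_b.handle("http://www.x.com/")   # second proxy is served by proxy_a
```

In this model only the first proxy contacts the origin; the second finds the first through the index. That is the load-shedding effect the steps describe.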

Akamai-flow.gif

Experiences/Comparison (multiple sections)

In multiple sections, describe what you learned.

Discussion

What was interesting? What was surprising? Here you can go out on tangents relating to your work

Conclusion

Summarize the report, point to future work.

References

Globally distributed content delivery (Akamai) - Accessed Feb. 10, 2011

Coral-CDN paper - Accessed Feb. 18, 2011

The Slashdot Effect (Wikipedia) - Accessed Feb. 24, 2011

Akamai - Why the Edge? - Accessed Feb. 25, 2011

How to build your own CDN... - Accessed Feb. 24, 2011

The Design of CoralCDN - Accessed Feb. 27, 2011

Akamai - State of the Internet - Accessed Jan. 25, 2011

Akamai - Online Video Publishers - Accessed Feb. 28, 2011