How AKAMAI works ... details of secret technology
rescued from http://web.archive.org/web/20030302122757/http://www.cs.washington.edu/homes/ratul/akamai.html
Here are my conjectures on how Akamai works. These are based on some experiments done on April 4th, 2000 using mainly "dig". These were conducted from some machines at UW and one at MIT.
1. What does an Akamaized page look like?
Suppose you enter www.cnn.com in your browser. This fetches the index.html file from the cnn server. That file contains images whose URLs point to the Akamai servers. Those URLs look like
- The number after "a", I think, identifies the customer. So 388 is cnn, 1380 is jcpenney and 620 is computer.com. Note that it is crucial to have a different machine name for each customer, as will become clear later.
- I am not sure what 7 stands for, but it was present in almost all of the Akamaized URLs I saw.
- Next is again the customer identifier.
- What the following two identifiers (21 and fade2068e7503e for cnn) represent is not fully clear. A plausible explanation (courtesy Neal Cardwell) is that the 14-digit hex strings are checksums of the content that the path refers to. That way the name always changes if the content changes, so the Akamai caches at the edge don't have to worry about consistency or freshness.
- Next is the customer URL itself. The path after that is identical to the path on the customer machine. So the above jcpenney URL and "www1.jcpenney.com/images/homepagev4/homepage/catalog.gif" lead to the same gif.
* update (courtesy John Jacob): The hex string is created using "md5sum [file_name] | cut -c3-16". It can also be replaced by a cache time-to-live value like "1d" for one day or "15m" for 15 minutes.
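The URL layout described in the bullets above can be sketched in code. This is only an illustration under assumptions: the sample URL in the usage note is hypothetical (assembled from the fields the bullets name, with a made-up file path), and the function and field names are invented here. The fingerprint helper simply mirrors the "md5sum | cut -c3-16" recipe.

```python
import hashlib
from typing import NamedTuple
from urllib.parse import urlsplit


class AkamaizedUrl(NamedTuple):
    edge_host: str       # machine name, e.g. a388.g.akamaitech.net
    host_customer: str   # number after "a" in the machine name
    type_code: str       # the unexplained "7"
    path_customer: str   # customer identifier repeated in the path
    serial: str          # first of the two unclear identifiers
    fingerprint: str     # 14-digit hex string, or a TTL such as "1d"
    origin_host: str     # the customer's own host name
    origin_path: str     # path on the customer's machine


def parse_akamaized_url(url: str) -> AkamaizedUrl:
    """Split a URL into the fields conjectured in the text (assumed layout)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    label = host.split(".")[0]
    if not (label.startswith("a") and label[1:].isdigit()):
        raise ValueError("host does not look like a<customer>.g.akamaitech.net")
    segments = parts.path.lstrip("/").split("/", 5)
    if len(segments) < 6:
        raise ValueError("path has fewer fields than the assumed layout")
    type_code, path_customer, serial, fingerprint, origin_host, rest = segments
    return AkamaizedUrl(host, label[1:], type_code, path_customer,
                        serial, fingerprint, origin_host, "/" + rest)


def content_fingerprint(data: bytes) -> str:
    """Equivalent of `md5sum file | cut -c3-16`: characters 3..16 of the hex digest."""
    return hashlib.md5(data).hexdigest()[2:16]
```

For example, `parse_akamaized_url("http://a388.g.akamaitech.net/7/388/21/fade2068e7503e/www.cnn.com/images/logo.gif")` yields customer 388 in both positions and the origin path `/images/logo.gif`; the path `images/logo.gif` is invented for the example.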
2. DNS Black Magic
I do not know the DNS system inside out, so the information here could be incomplete or simply wrong. Believe it at your own risk. Below is the chronology of steps that happen when an object from an Akamai server is to be fetched.
Step 1
From the top-level domain you first get the name servers of the akamaitech.net domain. Interesting things happen right here. There are 8 name servers reported, z[A-H].akamaitech.net. This information is good on the scale of days.
ZA.akamaitech.NET. 3h30m6s IN A 220.127.116.11
ZB.akamaitech.NET. 3h30m6s IN A 18.104.22.168
ZC.akamaitech.NET. 3h30m6s IN A 22.214.171.124
ZD.akamaitech.NET. 3h30m2s IN A 126.96.36.199
ZE.akamaitech.NET. 3h30m2s IN A 188.8.131.52
ZF.akamaitech.NET. 1d44m5s IN A 184.108.40.206
ZG.akamaitech.NET. 23h24m25s IN A 220.127.116.11
ZH.akamaitech.NET. 23h24m25s IN A 18.104.22.168
zA.akamaitech.NET. 1d16m48s IN A 22.214.171.124
ZB.akamaitech.NET. 1d16m48s IN A 126.96.36.199
ZC.akamaitech.NET. 1d16m48s IN A 188.8.131.52
ZD.akamaitech.NET. 1d16m48s IN A 184.108.40.206
ZE.akamaitech.NET. 1d16m48s IN A 220.127.116.11
ZF.akamaitech.NET. 1d16m48s IN A 18.104.22.168
ZG.akamaitech.NET. 1d16m48s IN A 22.214.171.124
ZH.akamaitech.NET. 1d16m48s IN A 126.96.36.199
- The machines at MIT and UW see different information. This could be because they access different NET name servers.
- Another interesting observation here is that the IP addresses in the two outputs are more or less (with one exception) a permutation of each other. What is achieved by having a different IP address for the same name is not clear to me. But having the IP addresses returned in a different order might achieve load balancing.
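The permutation observation can be checked mechanically. The sketch below assumes each dig output has already been reduced to a name-to-address mapping; the helper name is invented, and the addresses in the usage note come from the 192.0.2.0/24 documentation range rather than from the measurements.

```python
from collections import Counter
from typing import Dict, List, Tuple


def compare_views(view_a: Dict[str, str],
                  view_b: Dict[str, str]) -> Tuple[bool, List[str]]:
    """Compare two name -> IP mappings seen from different vantage points.

    Returns (is_permutation, moved_names):
      is_permutation - True when both views contain exactly the same
                       addresses, possibly attached to different names
      moved_names    - names present in both views whose address differs
    """
    is_permutation = Counter(view_a.values()) == Counter(view_b.values())
    moved_names = sorted(
        name for name in view_a
        if name in view_b and view_a[name] != view_b[name]
    )
    return is_permutation, moved_names
```

With `{"za": "192.0.2.1", "zb": "192.0.2.2", "zc": "192.0.2.3"}` as one view and the same three addresses rotated across the names in the other, the function reports a permutation and lists all three names as moved.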
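Step 2
Next you ask one of these name servers for the name servers of the g.akamaitech.net subdomain.
- The name servers are n[0-9]g.akamaitech.net.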
- This information can be cached for 30 minutes to 1 hour.
- The IP addresses of name servers returned are different for different client locations (IP addresses).
- The set of name server IPs returned to a particular client changes with time. So if you access cnn.com at different times (separated by DNS cache expiry) you could be contacting different name servers and thus downloading objects from different servers.
n1g.akamaitech.NET. 27m48s IN A 188.8.131.52
n2g.akamaitech.NET. 16m45s IN A 184.108.40.206
n7g.akamaitech.NET. 31m44s IN A 220.127.116.11
n3g.akamaitech.NET. 18m14s IN A 18.104.22.168
n4g.akamaitech.NET. 18m14s IN A 22.214.171.124
n8g.akamaitech.NET. 18m14s IN A 126.96.36.199
n5g.akamaitech.NET. 18m14s IN A 188.8.131.52
n0g.akamaitech.NET. 18m14s IN A 184.108.40.206
n6g.akamaitech.NET. 18m14s IN A 220.127.116.11
n2g.akamaitech.NET. 46m13s IN A 18.104.22.168
n3g.akamaitech.NET. 16m13s IN A 22.214.171.124
n8g.akamaitech.NET. 16m13s IN A 126.96.36.199
n4g.akamaitech.NET. 16m13s IN A 188.8.131.52
n5g.akamaitech.NET. 16m13s IN A 184.108.40.206
n6g.akamaitech.NET. 16m13s IN A 220.127.116.11
n0g.akamaitech.NET. 16m13s IN A 18.104.22.168
n1g.akamaitech.NET. 31m13s IN A 22.214.171.124
n7g.akamaitech.NET. 31m13s IN A 126.96.36.199
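To compare such listings it helps to turn each line into a record with the remaining lifetime in seconds. The parser below is a minimal sketch that assumes the five-field layout shown above (name, lifetime, class, type, address) and lifetimes written with d/h/m/s suffixes; the function and type names are invented for illustration.

```python
import re
from typing import NamedTuple

_TTL_UNITS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def ttl_to_seconds(ttl: str) -> int:
    """Convert a lifetime such as '27m48s' or '1d44m5s' to seconds."""
    if ttl.isdigit():          # plain number of seconds
        return int(ttl)
    pieces = re.findall(r"(\d+)([dhms])", ttl)
    # Reject input that contains anything besides number+unit pairs.
    if not pieces or "".join(n + u for n, u in pieces) != ttl:
        raise ValueError(f"unrecognized lifetime: {ttl!r}")
    return sum(int(n) * _TTL_UNITS[u] for n, u in pieces)


class ARecord(NamedTuple):
    name: str
    ttl_seconds: int
    address: str


def parse_a_record(line: str) -> ARecord:
    """Parse one 'name ttl IN A address' line as printed in the listings."""
    name, ttl, rr_class, rr_type, address = line.split()
    if (rr_class, rr_type) != ("IN", "A"):
        raise ValueError("not an IN A record")
    return ARecord(name.rstrip(".").lower(), ttl_to_seconds(ttl), address)
```

Applied to the first line of the listing above, `parse_a_record("n1g.akamaitech.NET. 27m48s IN A 188.8.131.52")` gives a lifetime of 1668 seconds, which is inside the 30-minute-to-1-hour caching window.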
My belief is that this is THE step where all the Akamai DNS magic is. It hands out a different set of name server IPs to clients contacting it from different IP addresses. How it determines the nearest set of servers from IP addresses is anybody's guess (proprietary, but allegedly they use BGP peering with the ISPs that host the Akamai clusters, which gives them a rough estimate of the distance of the requesting user from each site - courtesy Tommy Larsen). Apart from wire latency, other factors they claim to consider are the load on their servers and Internet congestion. They also claim to be able to monitor their servers in real time (once per second). This means that it gives out different sets of name server IPs at different times, which explains the short lifetime (30 minutes to 1 hour) of this information. For instance, the following is part of the dig output from the same UW machine at different times (or even from different machines at the same time - see below).
n2g.akamaitech.NET. 46m11s IN A 188.8.131.52
n3g.akamaitech.NET. 16m11s IN A 184.108.40.206
n8g.akamaitech.NET. 16m11s IN A 220.127.116.11
n4g.akamaitech.NET. 16m11s IN A 18.104.22.168
n5g.akamaitech.NET. 16m11s IN A 22.214.171.124
n6g.akamaitech.NET. 16m11s IN A 126.96.36.199
n0g.akamaitech.NET. 16m11s IN A 188.8.131.52
n1g.akamaitech.NET. 31m11s IN A 184.108.40.206
n7g.akamaitech.NET. 31m11s IN A 220.127.116.11
@UW2 (same machine at a different time)
n3g.akamaitech.NET. 21m28s IN A 18.104.22.168
n4g.akamaitech.NET. 21m28s IN A 22.214.171.124
n6g.akamaitech.NET. 21m28s IN A 126.96.36.199
n5g.akamaitech.NET. 21m28s IN A 188.8.131.52
n0g.akamaitech.NET. 21m28s IN A 184.108.40.206
n7g.akamaitech.NET. 36m28s IN A 220.127.116.11
n1g.akamaitech.NET. 36m28s IN A 18.104.22.168
n2g.akamaitech.NET. 21m28s IN A 22.214.171.124
n8g.akamaitech.NET. 21m28s IN A 126.96.36.199
This has another side effect that is nice for Akamai load balancing but not so nice for things like organization-wide caches. Different machines could be downloading the same object from different Akamai servers at the same time, if those machines connect to different primary name servers within the department (like our setup at UW-CSE).
In the final step you go to one of the ng.akamaitech.net name servers and get the IP address of the machine you are looking for (e.g. a388.g.akamaitech.net). The server will return two IP addresses for each machine name. For instance, see the partial dig output below
Observe that the lifetime of this information is a mere 20 seconds (which corresponds to one or two web pages viewed). After this period you go back to the Akamai name server to get the IP addresses again. This means that even if both machines go down, it is highly unlikely that the client will notice (assuming that the Akamai name servers detect the failure and return different IPs).
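The effect of the three very different lifetimes can be illustrated with a small cache that forgets entries once they expire. This is a sketch of a generic resolver-side cache, not of anything Akamai runs; the class name is invented, the lifetimes are rounded from the observations above (about a day, 30 minutes, 20 seconds), and the addresses are documentation placeholders.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class TtlCache:
    """Name -> addresses cache that drops entries once their lifetime is over."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[List[str], float]] = {}

    def put(self, name: str, addresses: List[str], ttl_seconds: float) -> None:
        self._entries[name] = (addresses, self._clock() + ttl_seconds)

    def get(self, name: str) -> Optional[List[str]]:
        entry = self._entries.get(name)
        if entry is None:
            return None
        addresses, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]   # expired: caller must ask the name server again
            return None
        return addresses


def demo() -> Tuple[List[str], List[str]]:
    """Return the names still cached after 21 seconds and after 31 minutes."""
    now = [0.0]                       # fake clock so the demo runs instantly
    cache = TtlCache(clock=lambda: now[0])
    cache.put("z-servers", ["192.0.2.1"], 24 * 3600)
    cache.put("n-servers", ["192.0.2.2"], 30 * 60)
    cache.put("a388.g.akamaitech.net", ["192.0.2.3", "192.0.2.4"], 20)
    names = ["z-servers", "n-servers", "a388.g.akamaitech.net"]
    now[0] = 21.0
    after_21s = [n for n in names if cache.get(n) is not None]
    now[0] = 31 * 60.0
    after_31min = [n for n in names if cache.get(n) is not None]
    return after_21s, after_31min
```

In the demo only the 20-second entry has to be refetched after 21 seconds, so a failed machine can be swapped out quickly while the upper levels of the hierarchy are still served from cache.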
It is likely that not every machine hosts the content of every customer. Suppose that there are three servers (three different machines) A, B and C at a particular site (a site will typically have multiple machines), and that the customers are X, Y and Z. Then A could host X and Y, B could host Y and Z, and C could host Z and X. This kind of arrangement has a two-fold advantage:
1) No server has to host all the customers' content, which eases the load on it and also makes content serving faster.
2) If any one server goes down, no customer is fully disconnected, as there is another server (potentially more) holding its content tree.
A customer's content could be present at more than two servers at a site, but it makes sense to keep returning the same two machines (as long as they are up) because of file caching: most of the time the object does not have to be retrieved from disk.
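The A/B/C arrangement above amounts to placing each customer on two neighbouring servers of a ring. The sketch below reproduces that conjectured layout and checks the single-failure property; it is an illustration of the idea only, with invented function names, and says nothing about how Akamai actually assigns content.

```python
from typing import Dict, List, Sequence


def ring_placement(servers: Sequence[str], customers: Sequence[str],
                   copies: int = 2) -> Dict[str, List[str]]:
    """Place customer i on `copies` consecutive servers of a ring."""
    if not 1 <= copies <= len(servers):
        raise ValueError("copies must be between 1 and the number of servers")
    placement: Dict[str, List[str]] = {server: [] for server in servers}
    for i, customer in enumerate(customers):
        for k in range(copies):
            # Server i holds customer i; the previous servers hold the extra copies.
            placement[servers[(i - k) % len(servers)]].append(customer)
    return placement


def survives_single_failure(placement: Dict[str, List[str]]) -> bool:
    """True if every customer is still hosted after any one server fails."""
    everyone = {c for hosted in placement.values() for c in hosted}
    for failed in placement:
        remaining = {c for server, hosted in placement.items()
                     if server != failed for c in hosted}
        if remaining != everyone:
            return False
    return True
```

`ring_placement(["A", "B", "C"], ["X", "Y", "Z"])` puts X and Y on A, Y and Z on B, and Z and X on C, and `survives_single_failure` confirms that losing any one of the three machines leaves every customer reachable.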
Another observed feature is that all the name servers in the set (n[0-9]g.akamaitech.net) returned in the above step give the same two IP addresses for a queried server (like a1388.g.akamaitech.net). This could mean that the configuration of all the servers in a set is identical and that multiple servers are there just for sharing the load.
Akamai caches (courtesy Neal Cardwell)
The Akamai machines at the edge are PCs running Linux and a slightly modified version of the Squid cache. They do on-demand caching rather than push-based replication.
Name server differences
An artifact of the above exploration is the observation that different versions of named might be running in the department (@UW-CSE).
While 188.8.131.52 (bs4) always returns the two IP addresses in a different order (to get some sort of load balancing), 184.108.40.206 (bs1) does no such thing.
And the one at MIT (220.127.116.11) seems to be returning the IP addresses in a random order.
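The three behaviours observed here (fixed order, alternating order, random order) can be modelled as three answer-ordering policies. The generators below only mimic what was seen from the outside; they are not taken from named, and all names and addresses are placeholders.

```python
import random
from itertools import count
from typing import Iterator, List, Optional, Sequence


def fixed_answers(addresses: Sequence[str]) -> Iterator[List[str]]:
    """Always return the addresses in the same order (what bs1 appeared to do)."""
    while True:
        yield list(addresses)


def rotating_answers(addresses: Sequence[str]) -> Iterator[List[str]]:
    """Shift the list by one position per query (what bs4 appeared to do)."""
    items = list(addresses)
    for query_number in count():
        shift = query_number % len(items)
        yield items[shift:] + items[:shift]


def random_answers(addresses: Sequence[str],
                   rng: Optional[random.Random] = None) -> Iterator[List[str]]:
    """Shuffle the list independently for every query (the MIT server)."""
    rng = rng or random.Random()
    while True:
        items = list(addresses)
        rng.shuffle(items)
        yield items
```

A client that always uses the first address sees a single server under the fixed policy, alternates between the two under rotation, and splits roughly evenly over many queries under random ordering.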
Akamai Documents (courtesy Jay Bivens, Bob Devine)
As stated above, these are just conjectures. Corrections/Comments welcome.
Last Modified : 9/03/01
A leaked Federal Aviation Administration memo written on the evening of Sept. 11 contains disturbing revelations about American Airlines Flight 11, the first to hit the World Trade Center. The "Executive Summary," based on information relayed by a flight attendant to the American Airlines Operation Center, stated "that a passenger located in seat 10B shot and killed a passenger in seat 9B at 9:20 a.m. The passenger killed was Daniel Lewin, shot by passenger Satam Al Suqami."
The FAA has claimed that the document is a "first draft," declining to release the final draft, as it is "protected information," noting the inaccuracies in reported times, etc. The final draft omits all mention of gunfire. Lewin, a 31 year-old dual American-Israeli citizen was a graduate of MIT and Israel's Technion. Lewin had emigrated to Israel with his parents at age 14 and had worked at IBM's research lab in Haifa, Israel. Lewin was a co-founder and chief technology officer of Akamai Technologies, and lived in Boston with his family. A report in Ha'aretz on Sept. 17 identified Lewin as a former member of the Israel Defense Force Sayeret Matkal, a top-secret counter-terrorist unit, whose Unit 269 specializes in counter-terrorism activities outside of Israel.
The videos of the impact corroborate each other (mostly, but fakes do exist!)
Therefore some of the videos must be authentic.
Since (hollow aluminium) airplanes do not fly through steel and concrete, the image of the aeroplane that we all saw must have been a holographic projection.. probably onto a missile..
and here it is.
Lewin, a graduate of MIT and Israel's Technion, lived with his wife, Anne, and two sons, Eitan and Itamar, in Brookline, Mass., where he helped run Akamai Technologies -- which he co-founded, nearly becoming a billionaire in the dot-com stock boom. He previously worked for IBM's research lab in Haifa, Israel. His parents and brothers all live in Israel.
Lewin belonged to Sayeret Matkal - this outfit specializes in aircraft hostage rescues
Daniel Lewin of Akamai. Before founding Akamai, Lewin was a captain in Sayeret Matkal, a top-secret Israeli anti-terrorist
In the OFFICIAL FAIRY TALE of Flight 11 (which did not exist!), Lewin was seated one row behind two of the hijackers and one row in front of one other hijacker. What are the odds that an Israeli anti-terrorism soldier would be sitting near all of these hijackers?
Akamai Technologies, Inc. (NASDAQ: AKAM) is a company that provides a distributed computing platform for global Internet content and application delivery, headquartered in Cambridge, Massachusetts. The company was founded in 1998 by then-MIT graduate student Daniel Lewin, along with MIT Applied Mathematics professor Tom Leighton and MIT Sloan School of Management students Jonathan Seelig and Preetish Nijhawan. Leighton still serves as Akamai's Chief Scientist, while Lewin was killed aboard American Airlines flight 11 which was crashed in the September 11, 2001 attacks. Akamai is a Hawaiian word meaning smart or intelligent.
Akamai transparently mirrors content (usually media objects such as audio, graphics, animation, video) stored on customer servers. Though the domain name (but not the subdomain) is the same, the IP address points to an Akamai server rather than the customer's server. The Akamai server is automatically picked depending on the type of content and the user's network location.
In addition to image caching, Akamai provides services which accelerate dynamic and personalized content, J2EE-compliant applications, and streaming media to the extent that such services frame a localized perspective.
As of 2008, Akamai started to release a quarterly "State of the Internet" report, where they present data and trends regarding traffic and bandwidth adoption. Their first report gathered information about 125 countries.
Akamai's customers include many large internet, media and computer companies including the BBC.
Arabic news network Al-Jazeera was a customer from 28 March 2003 until 2 April 2003, when Akamai decided to end the relationship. The network's English-language managing editor claimed this was due to political pressure.
In March 2005, Akamai signed an agreement to acquire Speedera Networks for 12 million shares of Akamai common stock, valued at $130 million at that time. Both companies also agreed to halt pending lawsuits involving trade secrets and patent infringement. The acquisition was completed in June 2005.
On April 12, 2007 Akamai acquired Red Swoosh in exchange for 350,000 shares of Akamai common stock. The acquisition of Red Swoosh was valued at approximately $15 million, net of cash acquired.