Matt Shanks found an interesting summary of a panel from the recent MySQL Conference & Expo called "Scaling MySQL – Up or Out?" where representatives from some of the internet's top sites gave insights into their database setups.

Think of it this way: Wikipedia is … huge. There are surprisingly few things you can search for and not find *some* information on. It's monstrously big. Yet, they host the entirety of it on just 20 database servers.

Facebook runs 30,000 databases, spread across *1,800* machines.

Web servers? (the machines that you connect to in order to get a web page like this one)

Wikipedia: 270
Facebook: 1200

It's just… huge.
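To put those numbers side by side, here is a rough bit of arithmetic on the figures quoted above. It assumes that Facebook's 1,800 machines are all database servers, which the summary implies but doesn't state outright, and the dictionary layout and function name are just my own for illustration. None of the inputs are independently verified.

```python
# Figures as quoted in the panel summary; nothing here is independently verified.
# Assumption: Facebook's 1,800 "machines" are treated as database servers.
sites = {
    "Wikipedia": {"db_servers": 20, "web_servers": 270},
    "Facebook": {"db_servers": 1800, "web_servers": 1200},
}


def web_per_db(site):
    """Return the number of web servers per database server for one site."""
    return site["web_servers"] / site["db_servers"]


# Ratio of web servers to database servers, rounded to two decimal places.
ratios = {name: round(web_per_db(s), 2) for name, s in sites.items()}

# Facebook's 30,000 databases spread over its 1,800 machines.
facebook_dbs_per_machine = round(30_000 / 1_800, 1)

for name, ratio in ratios.items():
    print(f"{name}: {ratio} web servers per database server")
print(f"Facebook: about {facebook_dbs_per_machine} databases per machine")
```

On these figures Wikipedia has many web servers for every database server, while Facebook has fewer web servers than database machines. That says something about how differently the two are built, though not much about which is the bigger operation.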

It is pretty interesting but it's also speculative. There's no indication of how powerful each of those machines is. For all we know, the Wikimedia Foundation is pumping four times as much bandwidth and storage space into each of its servers as Facebook is. Granted, there's little denying that Facebook must have some considerable resources, and the information it stores seems likely to take up more room than reams of encyclopaedic text.

But also consider this. The Facebook representative himself claims that the company uses a "cheap switch" technology for scaling, which could imply a more fragile network than they'd otherwise have us believe. On the other hand, it could indicate just the opposite. The point is, it's all pretty vague, not to mention unverifiable.

The only representative that I really believe is the one from YouTube, who seems to be toeing the company line more than anything else. He answered "I can not say" to most specific questions and, when asked if the company has anything to worry about at present, happily responded "not at all".

It's also worth noting the relative Alexa ranks of the reps (thanks to Mike). In this day and age, an Alexa number may not mean too much any more but once upon a time it was as close to a definitive "how important are you" figure as was possible.

(1317) Monty Taylor – MySQL
(905) Matt Ingenthron – Sun
(39) John Allspaw – Flickr
(13) Frank Mash – Fotolog
(9) Domas Mituzas – Wikipedia
(6) Jeff Rothschild – Facebook
(2) Paul Tuckfield – YouTube

Mike also points to a video of the panel which you could pore over and analyse if you can stand the poor quality (I couldn't).

I am curious about Frank Mash's answer to the same "anything to worry about" question:

Google has to approve it for our power (cut app servers by ½ by moving from php to java).

Could someone explain this to me please? Google don't fund/manage Fotolog (although they do source the advertising revenue via AdSense). Three years ago Fotolog was making all of its upgrades on the back of a US $2.4m investment from BV Capital. Are they now relying on Google to keep them going?

And besides, do and (assuming that these are the sites described by the firms' respective employees) really get that much traffic to qualify as "big internet sites" alongside these social networking behemoths?

There's just something about the whole panel that has me doubting its veracity.


Purty Blonde

Whilst researching this article, I discovered that the purty blonde who "lost her camera on holiday" and had her impressive shots put up on Facebook last year was actually nothing more than a porn ad.

She was found out back in September of last year, but I didn't notice until now.