A few days ago I gave an overview presentation of Windows Azure to some BizSpark startups. One of the questions I was asked most often, during and after the presentation, was why anyone would bother with a Windows Azure Table when SQL Azure is available.
I admit that after the presentation I felt I had not explained this very well. Unfortunately, I don’t think it can be explained in one shot, but rather as a series of explanations that eventually build up to one “Aha” moment. So I hope this post will help you understand why Azure Tables exist and why they’re cheap. In a later post, we will touch on some uses for Azure Tables.
Why an Azure Table is cheaper
To start off, let me point out an interesting characteristic of Tables that will tie all this together: a very simple structure.
When I think of Azure Tables, I always remember IKEA’s brochure explaining how they can sell furniture cheaper than anyone else. (If you need a primer on what IKEA is, Wikipedia has one.) They refer to it as “flat-packing”: the furniture pieces are designed so that more can be produced and transported. Assembling the furniture is also generally the buyer’s responsibility, not IKEA’s. As most of you know, when you buy something from IKEA, the parts are neatly arranged in a box in the slimmest way possible, with easy-to-follow instructions for assembling them.
In a very similar way, Tables are much like IKEA’s flat-packed furniture. They are structured very simply so that they are easier to transport and cheaper to produce.
Easier To Transport/Move
You might be asking right now: where are we supposed to move the Tables, and why would we do that anyway? Well, first and foremost, “moving” the Tables is handled by Microsoft for you (explanation coming in a bit), so you need not worry about that part. The interesting question is why Microsoft would need to “move” Tables.
By now (hopefully), you understand that one of the benefits of cloud computing is a platform that can stand up to hardware breakdown. This is done by having redundant “machines”, that is, several virtualized instances of your application. For example, I can have 3 virtualized instances of my application, so if 1 goes down, my users can still access the application because their requests are load-balanced to the other 2 instances. This keeps the application available, approaching 100% uptime in ideal cases.
Now, going back to the IKEA analogy, try to imagine IKEA running a sale on $1 dinner tables. They’d be swarmed with hungry buyers (high load). Fortunately for IKEA, they have designed and flat-packed each dinner table into a slim box, so they can stack more of them and sell or deliver more if they need to.
In the same way, Microsoft replicates your Tables so that they can stand up to high load. If one copy of your table goes down (a defective IKEA dinner table that needs to be replaced), Microsoft re-routes requests to the duplicate copies while it starts another copy elsewhere (the “move”), so that you keep a redundant setup. Microsoft can do this quickly and easily because Tables, unlike a relational database (say, SQL Azure), have a simpler structure that is easier to replicate across several machines. There are no table relationships to worry about (each IKEA dinner table comes with its own assembly instructions and tools), so one Azure table could be sitting on one machine and another table on a different machine, and neither has to worry about the other. User “queries” are simply re-routed to wherever Microsoft “moved” the Tables, and each table satisfies its requests independently.
In a relational database, it’s much more complex to “move” tables, because you’d have to bring along all the other tables related to the one you’re querying (assuming you have relationships across them). Every machine in the redundant setup would need a copy of all the related tables for the JOINs to work. This takes a lot of time and processing power, and time and power cost money. In Windows Azure Tables there is no JOIN; they were designed that way so that each table is independent and easier and faster to replicate. Data stored this way is said to be denormalized.
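To make the re-routing idea concrete, here is a toy sketch in Python. It is only an illustration of the concept and not how Microsoft actually implements replication, which this post does not go into. The names `Replica`, `ReplicatedTable`, `read`, and `repair` are all made up for this example, and the "machines" are just in-memory copies of a dictionary.

```python
import copy


class Replica:
    """One copy of a table living on one (pretend) machine."""

    def __init__(self, name, data):
        self.name = name
        self.data = copy.deepcopy(data)  # each replica holds its own full copy
        self.healthy = True


class ReplicatedTable:
    """Hypothetical model: keep several independent copies of one table."""

    def __init__(self, data, copies=3):
        self.replicas = [Replica(f"machine-{i}", data) for i in range(copies)]
        self._next_id = copies

    def read(self, key):
        # Re-route the request to the first copy that is still up.
        for replica in self.replicas:
            if replica.healthy:
                return replica.name, replica.data[key]
        raise RuntimeError("no healthy replica available")

    def repair(self):
        # "Move" the table: drop dead copies and clone a healthy one
        # onto fresh machines until the original copy count is restored.
        target = len(self.replicas)
        healthy = [r for r in self.replicas if r.healthy]
        if not healthy:
            raise RuntimeError("no healthy replica to copy from")
        while len(healthy) < target:
            healthy.append(Replica(f"machine-{self._next_id}", healthy[0].data))
            self._next_id += 1
        self.replicas = healthy


table = ReplicatedTable({"order-100": "dinner table"})
table.replicas[0].healthy = False      # one machine breaks down
served_by, value = table.read("order-100")  # request is served by another copy
table.repair()                         # redundancy is restored
```

Notice that `repair` only ever has to copy one self-contained table. Nothing else needs to travel with it, which is the point of the simple structure.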
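Here is a small Python sketch of the difference, using plain dictionaries rather than any real database or Azure API. The sample data and function names are invented for illustration. The `(partition key, row key)` pair mirrors how Azure Table entities are addressed, but treat the exact shape below as an assumption made for the example.

```python
# Normalized (relational) layout: two tables linked by customer_id.
customers = {1: {"name": "Ana"}}
orders = [{"order_id": 100, "customer_id": 1, "item": "dinner table"}]


def order_with_customer_joined(order_id):
    """Emulates a JOIN: both tables must be reachable from the same place."""
    order = next(o for o in orders if o["order_id"] == order_id)
    customer = customers[order["customer_id"]]
    return {
        "order_id": order["order_id"],
        "item": order["item"],
        "customer_name": customer["name"],
    }


# Denormalized layout: each entity carries everything needed to answer
# the query, so this one table can live on any machine by itself.
orders_denormalized = {
    ("customer-1", "order-100"): {
        "order_id": 100,
        "item": "dinner table",
        "customer_name": "Ana",  # duplicated here instead of joined in
    },
}


def order_with_customer_lookup(partition_key, row_key):
    """A single key lookup; no second table is involved."""
    return dict(orders_denormalized[(partition_key, row_key)])


joined = order_with_customer_joined(100)
looked_up = order_with_customer_lookup("customer-1", "order-100")
```

Both calls return the same answer. The trade-off is that the denormalized version stores the customer name with every order, so you pay with some duplicated data in exchange for tables that do not depend on each other.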
Now you might be asking, are you saying SQL Azure doesn’t have redundancy? Absolutely not! Redundancy is also built-in for SQL Azure. But as I said, the less-complex structure of an Azure Table makes it much cheaper.
Cheaper To Produce
Now you know why it’s like this: (Pay-As-You-Go Pricing from http://www.microsoft.com/windowsazure/pricing/).
Azure Tables
- $0.15 per GB stored per month
- $0.01 per 10,000 storage transactions
SQL Azure
- Web Edition
  - $9.99 per database up to 1GB per month
  - $49.95 per database up to 5GB per month
- Business Edition
  - $99.99 per database up to 10GB per month
  - $199.98 per database up to 20GB per month
  - $299.97 per database up to 30GB per month
  - $399.96 per database up to 40GB per month
  - $499.95 per database up to 50GB per month
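To see what those prices mean for a concrete workload, here is a rough calculation in Python. The prices are the ones listed above; the workload (1GB stored and 1,000,000 storage transactions in a month) is a made-up figure for illustration, and the function names are my own. It ignores anything not in the list above, such as bandwidth charges.

```python
from decimal import Decimal

# Azure Tables pay-as-you-go prices from the list above.
STORAGE_PER_GB = Decimal("0.15")
PER_10K_TRANSACTIONS = Decimal("0.01")

# SQL Azure tiers from the list above: (max size in GB, monthly price).
SQL_AZURE_TIERS = [
    (1, Decimal("9.99")),
    (5, Decimal("49.95")),
    (10, Decimal("99.99")),
    (20, Decimal("199.98")),
    (30, Decimal("299.97")),
    (40, Decimal("399.96")),
    (50, Decimal("499.95")),
]


def table_monthly_cost(gb_stored, transactions):
    """Storage charge plus transaction charge for one month."""
    storage = STORAGE_PER_GB * Decimal(str(gb_stored))
    traffic = PER_10K_TRANSACTIONS * Decimal(str(transactions)) / Decimal(10000)
    return storage + traffic


def sql_azure_monthly_cost(gb_stored):
    """Price of the smallest listed tier that fits the data."""
    for max_gb, price in SQL_AZURE_TIERS:
        if gb_stored <= max_gb:
            return price
    raise ValueError("larger than the biggest listed tier (50GB)")


# Hypothetical workload: 1GB of data, 1,000,000 transactions per month.
tables_cost = table_monthly_cost(1, 1_000_000)  # 0.15 + 1.00 = 1.15
sql_cost = sql_azure_monthly_cost(1)            # 9.99
```

For this assumed workload the Table comes to $1.15 per month against $9.99 for the smallest SQL Azure database. A much busier application would narrow the gap, since Tables charge per transaction.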
The IKEA dinner table is cheap because IKEA designed it with simplicity in mind: it doesn’t take up much space when it’s moved, and the buyer assembles it. The non-IKEA dinner table is expensive because it’s already an assembled piece. Moving it around the factory and delivering it to your house is much more of a hassle than moving a flat box. Also, IKEA can respond to demand much faster because the design of their tables lets them move and produce them more quickly.
At this point I hope one thing is obvious: for data storage on a cloud computing platform to stand up to unpredictable load, replication is a necessity. And the easiest, fastest, and cheapest way for Microsoft to replicate your data is to employ a simple structure. Hence, Azure Tables are provided as an option.
In a later post we will explore in more detail how queries can be much faster in Windows Azure Tables, and sample scenarios for using Tables.