Wednesday, February 10, 2010

Cloud Computing: Shared-Disk vs. Shared-Nothing

Anant Jhingran (IBM’s CTO, Information Management, Analytics and Optimization) challenged our assertion that the cloud benefits the shared-disk database architecture. For me to enter into a battle of technical vision with Anant is equivalent to bringing a knife to a gun battle, but I enjoy a good challenge.

1. Cloud storage: Anant argues that (a) SANs won’t beat local disk on cost; (b) many shared-nothing databases use SANs anyway. To quote Inigo Montoya from The Princess Bride: “Let me ’splain. No, there is too much. Let me sum up.”

Response: (a) While some clouds use traditional SAN or NAS storage, the trend among clouds is to assemble large collections of low-cost disks, using a cluster file system to handle disk striping and data redundancy and thereby provide SAN-like capabilities. As a result, the economics are quite similar to those of local disk; (b) We play in the MySQL market, where the vast majority of databases use local disk, making the comparison quite valid…for us. That said, we find that MySQL also commands a large percentage of the installed base on the cloud, making the comparison valid in general.

My broader point: Historically, the shared-nothing database had many advantages over the shared-disk database, particularly in the area of shared storage. Two major factors were at play: (1) shared storage was very expensive; (2) shared-disk databases split the storage performance across multiple nodes, meaning that with total storage performance of “Z”, a 4-node shared-disk database would deliver only 1/4 × Z to each node, making it expensive to deliver comparable performance on a per-node basis. The cloud minimizes (and is on a trajectory to eliminate) shared-nothing’s historical advantage in these areas, because cloud storage keeps getting cheaper and faster. With these traditional shared-nothing advantages rendered moot, the two architectures compete on other attributes, such as operational simplicity and dynamic elasticity, where shared-disk excels. These advantages are particularly relevant to the cloud. Shared-disk actually reduces costs in the cloud because it: (i) eliminates the need for redundant slaves (since each node provides fail-over to the other nodes); (ii) provides more evenly balanced load, since nodes are not specialized; (iii) supports dynamic elasticity at the database node level, where you only use/pay for the instances you need at the time.
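To make factor (2) above concrete, here is a minimal sketch of the per-node arithmetic. It assumes, as the example in the text does, that storage performance is split evenly across nodes; the function names and the value chosen for Z are hypothetical and purely illustrative.

```python
def per_node_throughput(total_throughput, nodes):
    """Share of storage throughput each node sees under an even split."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return total_throughput / nodes


def storage_needed(per_node_target, nodes):
    """Total storage throughput required so every node gets per_node_target."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return per_node_target * nodes


Z = 100.0  # hypothetical storage performance, arbitrary units

# A 4-node shared-disk cluster on storage of performance Z: each node sees Z/4.
print(per_node_throughput(Z, 4))

# To give each of the 4 nodes a full Z, the shared storage must deliver 4 x Z.
print(storage_needed(Z, 4))
```

This is why expensive shared storage was historically a double penalty: matching local-disk performance per node meant buying several times the throughput at a high unit price.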

2. Network Bandwidth: Anant suggests that this point is moot in comparing traditional and cloud computing.

Response: Maybe in the IBM/DB2 world where “many of the shared-nothing implementations of our clients use SANs”, but this is not the case in the MySQL world. Network performance plays a huge part in comparing shared storage vs. local storage. Again from a historical perspective, back when shared-nothing became all the rage and MySQL took off as the M in LAMP, Ethernet and Fast Ethernet were a serious bottleneck on shared-disk performance. Now, with Gigabit Ethernet, Fibre Channel and InfiniBand, the playing field is far more level. This is not cloud specific. Improvements in network performance leveled the playing field between the two database architectures, but the storage costs described above still played a big part. It was after the cloud changed the economics of storage that we began to see a reassessment of the traditional bias for shared-nothing.

3. Virtualization: Scaling up/down stateless CPUs is easier in the shared-disk architecture. But global state (e.g. locks) undermines the independence of the virtualized nodes. In addition, the database typically likes to take control of the entire stack.

Response: ScaleDB does not take control of the entire stack; instead, it is VM-friendly. ScaleDB’s implementation of the shared-disk model relies on a centralized lock manager, which also coordinates buffers and recovery among the nodes. It serves to coordinate the independent actions of the nodes, not to control them. From the perspective of the application, they continue to act independently. This combination makes ScaleDB very cloud friendly. You can surely argue that shared-nothing can scale to a larger number of nodes, but (a) most applications can get by with 50 or fewer database nodes; and (b) the process of scaling database nodes and maintaining those nodes in a shared-nothing cluster is quite painful.

If your argument is that shared-nothing has less state, it does, but it imposes more state information on the application, load balancer and the storage than shared-disk, so it is a trade-off. The key is to manage state in a scalable manner as we do in the ScaleDB lock manager.

4. As I understand it, the argument is for duplicate machines and distributed data that are loosely coupled, enabling rapid kill/restart in case of failure. The claim is that this is easier in shared-nothing.

Response: If I understand your point correctly, this would be easier in shared-disk. Shared-nothing introduces complexity in maintaining replicas and backups, in general database file reorganization, and in QoS in a multi-tenant environment. By avoiding this pain, shared-disk is easier to maintain than shared-nothing. In short, the kill/redirect model of shared-disk provides faster response to failure than the kill/restart model employed by shared-nothing, and it is far easier to maintain.

Conclusion: In answer to points #1 and #2 above, advances in networking and storage have narrowed the gap between shared-disk and shared-nothing. Cloud economics have then made this powerful shared storage economically compelling. For points #3 and #4, the advantage goes to shared-disk. In addition, the natural synergy between cloud computing and the shared-disk database goes much further:

a. Instead of using a fixed partitioning model like shared-nothing, shared-disk is dynamically elastic. You can add storage capacity and compute capacity on the fly without interruption or additional work. In addition to the flexibility this affords the developer, it also enables scaling on demand. The static partitioning model of shared-nothing invariably results in reserving excess capacity to accommodate usage spikes and future growth. Since the cloud enables on-demand allocation of resources on a pay-per-use model, shared-disk is simply more compatible with the cloud.

b. The elimination of the partitioning/sharding of data and the replication, promotion and synching of slaves reduces the burden on the user and on the cloud administrator. Look closely at Amazon’s RDS and you’ll see that these things are disabled because they are a pain to maintain. The simplicity of the shared-disk architecture wins this as well.

c. Economics 1: See the cloud database white paper I wrote on this. Compute instances are more expensive than storage in the cloud. Since shared-disk generally uses fewer compute instances—by eliminating slaves and through better distribution of database requests via cluster-level load balancing—the cost of a shared-disk system will, in most cases, be lower than shared-nothing.

d. Economics 2: Since shared-disk is more dynamic, enabling scaling on the fly, one can replace a large instance used by the more rigid shared-nothing database with a collection of smaller instances. Given the disproportionate increase in pricing of large instances, relative to the aggregate performance of less expensive smaller instances, it is more economical to use shared-disk in the cloud. Consider, for example, a 10-node shared-disk cluster costing $0.85 per hour versus a single Quadruple Extra Large Instance for a shared-nothing database costing $2.40 per hour (almost three times as much). Then consider that you could scale down to two nodes in the shared-disk example during slow times, paying only $0.17 per hour, instead of maintaining the $2.40 per hour shared-nothing database.
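The arithmetic in the example above can be sketched as follows. The hourly prices are the ones quoted in the text; the per-node price is implied by them ($0.85 for 10 nodes, $0.17 for 2 nodes). The split of the day into 8 busy hours and 16 slow hours is a hypothetical assumption chosen only for illustration, as are the function and constant names.

```python
# Prices quoted in the example above (2010 figures).
SMALL_NODE_PER_HOUR = 0.85 / 10    # implied per-node price: $0.085 per hour
LARGE_INSTANCE_PER_HOUR = 2.40     # single Quadruple Extra Large Instance


def cluster_cost(nodes, hours, per_node=SMALL_NODE_PER_HOUR):
    """Cost of running `nodes` small instances for `hours` hours."""
    return nodes * hours * per_node


def daily_costs(peak_hours, peak_nodes=10, quiet_nodes=2):
    """Return (shared_disk, shared_nothing) cost for one 24-hour day.

    Shared-disk runs peak_nodes during peak hours and scales down to
    quiet_nodes otherwise; shared-nothing keeps the large instance all day.
    """
    quiet_hours = 24 - peak_hours
    shared_disk = (cluster_cost(peak_nodes, peak_hours)
                   + cluster_cost(quiet_nodes, quiet_hours))
    shared_nothing = LARGE_INSTANCE_PER_HOUR * 24
    return shared_disk, shared_nothing


# Hypothetical day: 8 busy hours, 16 slow hours.
shared_disk, shared_nothing = daily_costs(peak_hours=8)
print(f"shared-disk:    ${shared_disk:.2f} per day")
print(f"shared-nothing: ${shared_nothing:.2f} per day")
print(f"hourly price ratio at peak: {LARGE_INSTANCE_PER_HOUR / 0.85:.2f}x")
```

Under these assumptions the shared-disk cluster comes to about $9.52 per day against $57.60 for the always-on large instance. The comparison covers price only; it does not model whether ten small instances actually match the large instance's performance for a given workload.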

I maintain my assertion that both network performance and cloud storage have leveled the playing field for the underlying economic and performance comparisons between shared-nothing and shared-disk databases. On such a level technical and economic field, the functionality, availability and operational ease-of-use delivered by shared-disk make it a superior solution for OLTP clustering in the cloud.


  1. knife to a gun-fight? i am a pacifist :) :)

    well reasoned above. MySQL might be (probably is) a different environment than DB2. So some of the scores I gave might be biased. However, what I have not understood is why do you have to have a "cloud" umbrella for what is essentially a good idea -- shared disks work well in some workloads, and making shared disks happen for MySQL is important, and ScaleDB is delivering that. Why cloudify :) an essentially simple argument?

    I agree that "state" in shared-nothing is maintained somewhere, but typically you do that by having some "keystone", and hardening it. Once you do that, the rest of the infrastructure is very flexible and resilient to failures (and here I am not talking implementation of database x or database y, which always has some warts), but in principle, I find it difficult to argue against this.

  2. Anant, Clearly the shared-disk DBMS (SD) has broad appeal for workloads independent of the cloud. If we had the resources and installed base of IBM, we would be more general purpose. But we have neither, so we need to pick our shots, and focus our resources where we can get the most bang for the buck.

    SD, by definition, relies on fast shared storage. For onsite IT, this storage can involve a lot of money (for the MySQL world), but in the cloud, fast storage is almost free and requires no capital investment. This lowers a barrier to entry for us.

    Also, in the cloud we solve a big problem. One of the compelling selling points of cloud is elasticity + pay-per-use. This model fails at the database layer (or is very difficult and messy for both the cloud vendor and the developer). SD makes it easy.

    Also, the cloud is filled with innovators and early adopters, making it an ideal launchpad for our solution. If someone is willing to try the cloud, there is a very good chance that they are willing to try ScaleDB and SD behind their old favorite MySQL.

    We also get good leverage from the cloud. We can support a smaller universe of interactions (e.g. various cluster file systems, storage devices, etc.) and still get a huge initial bang for the buck, while we backfill these things over time.

    The cloud vendors, who really need this stuff, can also be a very powerful channel for us.

    Now, that said, think of the cloud as a launching pad for us. I think it has tremendous upside, but over the long-term, we can evolve into private cloud and onsite IT as well. In fact, about 50% of the customers who approach us want to run it onsite.

    I understand your state issue, we'll have to show you our stuff ;-)

    Regarding bringing a knife to a gun fight, overwhelming force is the best way to ensure pacifism. If both combatants have equivalent force, that is when trouble starts. ;-)

  3. you are funny. sure, we should get together so that I can understand more. Jnan, you, me?
