With the introduction of each new platform comes the opportunity for new thinking, new applications and new winners. DEC and Oracle were beneficiaries of the move to the minicomputer. Microsoft was the main beneficiary of the move to the PC. Sun rode the workstation to fame. Today’s exciting new platform is the cloud, and one of the upstart contenders is NoSQL.
One might argue that the cloud is merely the hosting of well-established platforms such as the PC. Larry Ellison has made this very claim. However, the cloud is very different.
How is the cloud different? Sometimes when you combine things, the combination is very different from its components. For example, table salt (NaCl) is very different from its poisonous individual components, sodium and chlorine. Cloud computing enjoys a similar combinatory effect. Sure, it is merely a mixture of PC platforms, virtualization, lots of Linux and low-cost scalable disk arrays. But the combination is more about dynamic on-demand elasticity, elimination of capital expense, instant access to compute resources (versus slow hardware requisitioning), reduced IT headcount hassles and so on. In other words, cloud computing is no longer about the components; it is about changing how we think about and use computing resources. It is a new paradigm for the consumption of computing resources.
With this new paradigm comes a new mentality. Cloud developers expect all aspects of the cloud to scale dynamically. This is where the shared-nothing SQL database comes up short. It is also where the NoSQL option excels.
We in the SQL world could easily dismiss NoSQL, saying NoSQL = NoEnterprise. How can you build a real application on something that doesn’t offer transactions, data consistency, SQL, etc.? Real database people turn up their noses at those little key-value pair NoSQL toys. Not so fast.
SimpleDB just fired a shot across the bow of the database big boys with forced consistency. Sure, you pay a price for this, and it should only be invoked when it is truly needed, but the point is you CAN do it. The history of technology is littered with the bodies of high-end products that were cannibalized from below, as lighter-weight platforms won the price/volume game. Cloud will definitely win the price/volume game; you simply cannot beat the economics. The question is who will win the cloud database war.
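To make the "you pay a price, but you CAN do it" point concrete, here is a toy Python sketch of the trade-off. It is not SimpleDB's actual API: the class ToyReplicatedStore, its methods and the consistent flag are all hypothetical names, and the single primary plus one lagging replica is an assumed setup chosen only to show why a consistent read sees the latest write while a cheaper, eventually consistent read may not.

```python
class ToyReplicatedStore:
    """Toy model (hypothetical, not a real product API): one primary, one lagging replica."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = []  # writes accepted by the primary but not yet on the replica

    def put(self, key, value):
        # Writes land on the primary first and are queued for the replica.
        self.primary[key] = value
        self.pending.append((key, value))

    def replicate(self):
        # Stands in for background replication catching up.
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()

    def get(self, key, consistent=False):
        # A consistent read goes to the primary and sees every accepted write.
        # An eventually consistent read goes to the replica and may be stale.
        source = self.primary if consistent else self.replica
        return source.get(key)


store = ToyReplicatedStore()
store.put("balance", 100)
stale = store.get("balance")                   # replica has not caught up yet
fresh = store.get("balance", consistent=True)  # forced to the primary
store.replicate()
caught_up = store.get("balance")               # replica now agrees
print(stale, fresh, caught_up)                 # prints: None 100 100
```

In this sketch the "price" is that every consistent read has to go to the one copy that is guaranteed current, which is exactly the kind of constraint that limits scaling, so it makes sense to ask for it only when the application truly needs it.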
NoSQL databases (e.g. Cassandra, SimpleDB, BigTable, CouchDB and MongoDB) will continue to nibble away at the rationale for sticking with big SQL databases. As the leading web database, MySQL became the de facto cloud database, since web and Web 2.0 applications were the early adopters of the cloud. But MySQL cannot rest on its laurels. NoSQL solutions are nipping at MySQL’s heels and their dynamic elasticity is quite appealing.
Now enterprise customers are beginning to move to the cloud. At the same time, NoSQL solutions are adding capabilities once reserved for relational databases. This raises a LOT of questions:
1. Will NoSQL undermine its scalability as it adds more enterprise capabilities? (Will these extensions bolt on smoothly, or will they result in an awkward and ultimately unscalable Frankenstein?)
2. Will the big SQL database vendors continue to dismiss NoSQL solutions as toys, or will they see them for the threat they are becoming? (Should we expect the commercial database vendors to start buying NoSQL solutions?)
3. Will MySQL be the first to succumb to the NoSQL onslaught? (Did Oracle just buy yesterday’s cloud database leader?)
4. Will a third-party candidate like ScaleDB, with its shared-disk architecture, win with a “best of both worlds” approach that scales dynamically and provides enterprise SQL capabilities?
5. Will SQL and NoSQL co-exist as different tools for different problems, or will they evolve into direct competitors across most major segments?
My Thoughts:
At the moment, SQL databases and NoSQL are different tools for different problems. I think this will remain the case, but I believe that NoSQL will spread its reach by adding capabilities that begin to eat into traditional relational database segments. I suspect that the large commercial database companies, after ignoring NoSQL for too long, will resort to buying some NoSQL vendors and integrating them into their product portfolios. Companies focused solely on worldwide scalability, like Google, will remain wedded to NoSQL, because any technology that doesn’t scale to 10,000 servers is a non-starter. Enterprises will take a “right tool for the job” approach, employing all of the above.
NoSQL and map-reduce technologies will excel in non-transactional roles like data warehousing and business intelligence (DW/BI). In the OLTP space, SQL databases will remain far more prominent. However, the pain of dynamically scaling shared-nothing databases (and sharding is a pain) will create a need for dynamically elastic shared-disk databases like ScaleDB. The sweet spot for shared-disk probably tops out at about 80-100 database servers. This level of scaling should be sufficient for all but the largest companies. Beyond that, NoSQL (utilizing little or no scale-limiting constraints like forced consistency) will be the only option.
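For anyone who has not lived through a re-shard, here is a small Python sketch of why I call it a pain. It assumes the simplest placement scheme, hashing each key and taking it modulo the number of servers. That scheme is my assumption for illustration, not how any particular product does it, and the function names and the "customer:N" keys are made up. Real deployments often use smarter schemes, such as consistent hashing, precisely to soften this effect.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Place a key with naive hash-modulo sharding (an assumed, simplistic scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def fraction_moved(keys, old_shards: int, new_shards: int) -> float:
    """Fraction of keys whose shard changes when the server count changes."""
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


# Hypothetical keys; any reasonably varied key set behaves similarly.
keys = [f"customer:{i}" for i in range(10_000)]
moved = fraction_moved(keys, 4, 5)
print(f"{moved:.0%} of rows change shards when going from 4 to 5 servers")
```

With this naive scheme, a key stays put only if its hash gives the same remainder for 4 and for 5, which happens for roughly one key in five. Adding a single server therefore means relocating roughly four-fifths of the data, which is not what a cloud developer means by "dynamic elasticity".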
I would love to hear your thoughts in the comments section below…