Monday, January 7, 2013

Database Virtualization, What it Really Means


This is a response to a blog post by analyst and marketing consultant Curt Monash.

Originally, virtualization meant running one operating system in a window inside another operating system, e.g. running Linux on a Windows machine using Microsoft Virtual PC or VMware. Then virtualization evolved to mean slicing a single server into many for more granular resource allocation (Curt’s ex uno plures, translated: out of one, many). It has since expanded to include e pluribus unum (from many, one) and e pluribus ad pluribus (from many to many). This is evident in the compound terms built on the word “virtualization”: server virtualization, storage virtualization, network virtualization and now database virtualization.

Server Virtualization: Abstracts physical servers, presenting them as one or more logical entities. VMware enables dividing a single physical resource (compute or storage) into multiple smaller units (one-to-many), as well as combining multiple physical units into a single logical unit, which it calls clustering (many-to-one). Since a clustered collection of physical servers can address a clustered collection of physical storage devices, it also supports the many-to-many configuration. The essence of virtualization, then, is the ability to address compute and storage resources logically while abstracting the underlying physical representation.

This modern definition of virtualization is also evident in the following terms:

Storage Virtualization: Splitting a single disk into multiple virtual partitions (one-to-many), presenting a single logical view that spans multiple physical disks (RAID), and splitting multiple disks, often for high availability, across multiple logical storage devices (mirroring or LUNs). See also Logical Volume Management.
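To make the "single logical view that spans multiple physical disks" idea concrete, here is a minimal sketch of a many-to-one address mapping. It assumes simple linear concatenation of disks rather than RAID striping, and the class name `LogicalVolume`, the method `locate` and the disk sizes are hypothetical names and values chosen for illustration, not taken from any product.

```python
from bisect import bisect_right
from itertools import accumulate


class LogicalVolume:
    """Toy many-to-one mapping: several physical disks shown as one block range."""

    def __init__(self, disk_sizes):
        # disk_sizes: number of blocks on each (hypothetical) physical disk
        self.disk_sizes = list(disk_sizes)
        # Cumulative end offsets, e.g. sizes [100, 50, 200] -> [100, 150, 350]
        self.ends = list(accumulate(self.disk_sizes))

    @property
    def size(self):
        # Total number of logical blocks the caller can address
        return self.ends[-1] if self.ends else 0

    def locate(self, logical_block):
        """Translate a logical block number to (disk index, block on that disk)."""
        if not 0 <= logical_block < self.size:
            raise IndexError("logical block out of range")
        # First disk whose end offset lies strictly beyond the logical block
        disk = bisect_right(self.ends, logical_block)
        start = self.ends[disk] - self.disk_sizes[disk]
        return disk, logical_block - start


# Three physical disks appear to the caller as one 350-block volume.
volume = LogicalVolume([100, 50, 200])
first_on_second_disk = volume.locate(100)
```

The caller only ever sees logical block numbers; which physical disk holds a block is a detail hidden behind `locate`, which is the abstraction the definitions above describe.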

Network Virtualization: “In computing, network virtualization is the process of combining hardware and software network resources and network functionality into a single, software-based administrative entity, a virtual network.” This is a many-to-one model. 

Virtual IP Addresses: These enable an application to have a single IP address that actually maps to multiple NICs.

This brings us to defining “Database Virtualization”. We believe that the most comprehensive description of database virtualization is the abstraction of physical resources (compute, data, RAM) into a logical representation, supporting many-to-one, one-to-many and many-to-many relationships. This is exactly what ScaleDB does better than any other database company, and this is why we are considered leaders in the nascent field of database virtualization.

ScaleDB provides a single logical view (of a single database) while that database is actually composed of a cluster of multiple database instances operating over shared data. Whether you call this many-to-one (many database nodes acting as one logical database) or one-to-many (one logical database split across many nodes) is a matter of perspective. In either case, this enables independent scaling of both compute and I/O, eliminating the need for painful sharding, while also supporting multi-tenancy.
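As a rough illustration of "many database nodes acting as one logical database", the toy sketch below puts one logical entry point in front of several nodes that all operate over the same shared data. It is not ScaleDB's implementation or API: `Node`, `LogicalDatabase`, the round-robin routing and the in-memory dictionary standing in for shared storage are all assumptions made for this example.

```python
from itertools import count


class Node:
    """One (hypothetical) database instance working on shared data."""

    def __init__(self, name, shared_store):
        self.name = name
        self.store = shared_store  # the same object is shared by every node
        self.handled = 0           # how many requests this node served

    def execute(self, op, key, value=None):
        self.handled += 1
        if op == "put":
            self.store[key] = value
            return value
        return self.store.get(key)


class LogicalDatabase:
    """Single logical view; callers never pick a node themselves."""

    def __init__(self, node_names):
        self.store = {}  # stand-in for shared storage
        self.nodes = [Node(n, self.store) for n in node_names]
        self._requests = count()

    def add_node(self, name):
        # Adds compute capacity without moving or re-partitioning data
        self.nodes.append(Node(name, self.store))

    def execute(self, op, key, value=None):
        # Round-robin routing, chosen here only for simplicity
        node = self.nodes[next(self._requests) % len(self.nodes)]
        return node.execute(op, key, value)


db = LogicalDatabase(["node-a", "node-b"])
db.execute("put", "customer:1", "Alice")   # served by node-a
answer = db.execute("get", "customer:1")   # served by node-b, same data
db.add_node("node-c")
```

Because every node sees the same data, the write handled by one node is visible to a read handled by another, and adding `node-c` requires no change on the caller's side. A real clustered database also needs locking, caching and failure handling, which this sketch leaves out entirely.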

Curt then puts words in our mouths, stating that we claim: “Any interesting database topology should be called ‘database virtualization’.” We make no such claim. In fact, we state very clearly on our database virtualization page: “Database virtualization means different things to different people; from simply running the database executable in a virtual machine, or using virtualized storage, to a fully virtualized elastic database cluster composed of modular compute and storage components that are assembled on the fly to accommodate your database needs.”

In the marketing world, perception is reality. Since people are making claims of providing database virtualization, it is only prudent to include and compare their products in a comprehensive evaluation of the space. Just as Curt addresses many things that are not databases (e.g. Memcached) in order to provide the reader with a comprehensive understanding, so do we when talking about database virtualization. One need only consider that we include “running the database executable in a virtual machine” as one of the approaches that some consider to be “database virtualization.” While we consider our approach to be the best solution for database virtualization, ignoring what other people consider to be database virtualization would have displayed extreme hubris on our part.

We appreciate Curt shining the light on database virtualization and we agree that it is “a hot subject right now.” It is a new field and therefore requires a comprehensive evaluation of all claims and approaches, enabling customers to decide which is the best approach for their needs. We remain quite confident that we will continue to lead the database virtualization market based upon our architectural advantages.

As the database virtualization market heats up, and as we enhance our solution set, we remain confident that Curt and other analysts will come to appreciate our unique advantages in this market.