Thursday, May 2, 2013

Thoughts on Xeround and Free!


Everybody loves free. It is the best marketing term one could use. Once you say “FREE” the people come running. Free makes you very popular. Whether you are a politician offering something for free, or a company providing free stuff, you gain instant popularity.

Xeround is shutting down their MySQL Database as a Service (DBaaS) because their free instances, while popular, simply did not convert into sufficient paid instances to support the company. While I am sad to see them fail, because I appreciate the hard work required to deliver database technology, this announcement was not unexpected.

My company was at Percona Live, the MySQL conference, and I had some additional conversations along these same lines. One previously closed source company announced that they were open sourcing their code, it was a very popular announcement. A keynote speaker mentioned it and the crowd clapped excitedly. Was it because they couldn’t wait to edit the code? Probably not. Was it because now the code would evolve faster? Probably not, since it is very low-level and niche oriented, and there will be few committers. No, I think it was the excitement of “free”. The company was excited about a 49X increase in web traffic, but had no idea what the impact would be on actual revenues.

I spoke with another company, also a low-level and niche product, and they have been open source from the start. I asked about their revenues, they are essentially non-existent. Bottom line is that the plan was for them to make money on services…well Percona, Pythian, SkySQL and others have the customer relationships and they scoop up all of the consulting and support revenue while this company makes bupkis. I feel for them.

I had a friend tell me that ScaleDB should open source our code to get more customers. Yes open source gets you a lot of free users…not customers. It is a hard path to sell your first 10...25…50…etc. customers, but the revenue from those customers fuels additional development and makes you a fountain of technology. Open source and free are great for getting big quickly and getting acquired, but it seems that if the acquisition doesn’t happen, then you can quickly run out of money using this model (see Xeround).

I realize that this is an unpopular position. I realize that everybody loves free. I realize that open source has additional advantages (no lock-in, rapid development, etc.), but in my opinion, open source works in only two scenarios: (1) where the absolute volume is huge, creating a funnel for conversion (e.g. Linux); (2) where you need to unseat an entrenched competitor and you have other sources of revenue (e.g. OpenStack).

I look forward to your comments. We also look forward to working with Xeround customers who are looking for another solution.

Monday, January 7, 2013

Database Virtualization, What it Really Means


This is a response to a blog post by analyst and marketing consultant Curt Monash.

Originally virtualization meant running one operating system in a window inside of another operating system, e.g. running a Linux on a Windows machine using Microsoft Virtual PC or VMWare. Then virtualization evolved to mean slicing a single server into many for more granular resource allocation (Curt’s ex uno plures, translated: out of one, many). It has since expanded to include e pluribus unum (from many, one) and e pluribus ad pluribus (from many to many). This is evidenced in the use of the term “virtualization” to create the compound words: server virtualization, storage virtualization, network virtualization and now database virtualization.

Server Virtualization: Abstracts the physical (servers), presenting it as a logical entity or entities. VMWare enables dividing single physical resources (compute or storage) into multiple smaller units (one-to-many), as well as combining multiple physical units into a single logical unit, which they call clustering (many-to-one). Since a clustered collection of physical servers can address a clustered collection of physical storage devices, it therefore also supports the many-to-many configuration. If we extract the essence of virtualization it is the ability to address compute and storage resources logically, while abstracting the underlying physical representation.

This modern definition of virtualization is also evident in the following terms:

Storage Virtualization: Splitting a single disk into multiple virtual partitions (one-to many), a single logical view that spans multiple physical disks (RAID) and splitting multiple disks, often for high-availability, across multiple logical storage devices (mirroring or LUNs).  see also Logical Volume Management

Network Virtualization: “In computing, network virtualization is the process of combining hardware and software network resources and network functionality into a single, software-based administrative entity, a virtual network.” This is a many-to-one model. 

Virtual IP Addresses: This enables the application to have a single IP address that actually maps to multiple NICs. 

So this brings us to the topic of defining “Database Virtualization”. We believe that the most comprehensive description of database virtualization is the abstracting from physical resources (compute, data, RAM) to the logical representation, supporting many-to-one, one-to-many and many-to-many relationships. This is exactly what ScaleDB does better than any other database company, and this is why we are considered leaders in the nascent field of database virtualization.

ScaleDB provides a single logical view (of a single database) while that database is actually comprised of a cluster of multiple database instances operating over shared data. Whether you call this many-to-one (many database nodes acting as one logical database) or one-to-many (one logical database split across many nodes) is a matter of perspective. In either case, this enables independent scaling of both compute and I/O, eliminating the need for painful sharding, while also supporting multi-tenancy.

Curt then puts words in our mouths stating that we claim: “Any interesting database topology should be called “database virtualization”.” We make no such a claim. In fact, we state very clearly on our database virtualization page: “Database virtualization means different things to different people; from simply running the database executable in a virtual machine, or using virtualized storage, to a fully virtualized elastic database cluster composed of modular compute and storage components that are assembled on the fly to accommodate your database needs.”

In the marketing world, perception is reality. Since people are making claims of providing database virtualization, it is only prudent to include and compare their products, in a comprehensive evaluation of the space. Just as Curt addresses many things that are not databases (e.g. Memcached) in order to provide the reader with a comprehensive understanding, so do we when talking about database virtualization. One need only consider that we include “running the database executable in a virtual machine” as one of the approaches that some consider to be “database virtualization.” While we consider our approach to be the best solution for database virtualization, ignoring what other people consider to be database virtualization would have displayed extreme hubris on our part.

We appreciate Curt shining the light on database virtualization and we agree that it is “a hot subject right now.” It is a new field and therefore requires a comprehensive evaluation of all claims and approaches, enabling customers to decide which is the best approach for their needs. We remain quite confident that we will continue to lead the database virtualization market based upon our architectural advantages.

As the database virtualization market heats up, and as we enhance our solution set, we remain confident that Curt and other analysts will come to appreciate our unique advantages in this market. 

Thursday, September 22, 2011

Lack of Business Visibility Cripples Traditional SQL DaaS, Drives NewSQL

More and more public cloud companies are moving to managed cloud services to improve their value-add (price premium) and the stickiness of their solution. However, the shift to a database as a service (DaaS) severely reduces the DBAs visibility into the business, thus limiting the ability to hand tune the database to the requirements of the application and the database. The solution is a cloud database that eliminates the hand-tuning of the database, thereby enabling the DBA to be equally effective even with limited visibility into the business and application needs. It is these unique needs, particularly for SQL databases, that is fueling the NewSQL movement.

DBAs traditionally have insight into the company, enabling them to hand tune the database in a collaborative basis with the development team, such as:

1. Performance Trade-offs/Tuning: The database is partitioned and tuned to address business requirements, maximizing performance of certain critical processes, while slowing less critical processes.

2. System Maintenance Planning: You need to shut down the database to do an application/database upgrade, repartition, etc. but you may need to coordinate with the development team, and the schedule may change up until the last moment.

3. Application Evolution: The database must be designed and tuned to accommodate the planned changes in the application.

4. Consulting with Application Developers: Since the database partitioning and performance are hand-tuned, the DBA must collaborate closely with the application development team on design, development and deployment.

5. Partitioning Requirements: When partitioning (or repartitioning) your database you’ll need to partition the data to suit application requirements avoiding things like cross partition joins, range scans, aggregates, etc., which can cause tremendous performance penalties if not implemented correctly.

6. Moving Processes to the Application Layer: Single server databases can handle joins, range scans, aggregates and the like internally. However, when you partition the database these functions are typically moved to the application layer. The application must add the logic to accomplish these things. It must also add the routing code to point to the correct partitions. As a result, an application written to a single server API does not work in a multi-server configuration.

When moving from a self-managed database—either in the cloud or on premise—to a DaaS, the “DBA-in-the-cloud” doesn’t have that visibility into the business requirements, performance requirements, development schedule, and more. This lack of visibility turns the already challenging task of hand-tuning the database into a near impossibility using traditional databases.

A real world example: You are the DBA-in-the-cloud. Your customer has been running on a single server, but they need to now scale-out across multiple database nodes to accommodate growth. How do you split the data? You don’t know which queries have higher performance priority. You don’t know the development plan for new version of the application. You don’t know when it is convenient to shut down the application to implement the partitioning. You need to inform the application developers how they should implement their routing code to send database requests to the right nodes. You need to inform the application developers that they need to handle joins, range scans and aggregates at the application level, since the database can no longer handle those.

DBAs have enough of a challenge scaling and maintaining databases when they have full visibility into the business. To address these challenges without that level of visibility is unrealistic. I refer to this problem as the “Blind DBA” challenge, because the DaaS DBA has a serious lack of visibility into the requirements and inner workings of the company.

ScaleDB is a NewSQL database uniquely designed to handle the Blind DBA challenge inherent in DaaS implementations. ScaleDB is based on a shared-disk architecture that scales in a manner that is invisible to the application. It eliminates the need to push functions (joins, range scans, aggregates) up to the application layer. It eliminates the need to coordinate your application and database design to work around partitions, because there are no partitions. It eliminates the performance tradeoffs in partitioning design, you get consistent performance across the database. It eliminates scheduling and coordinating application shut-down to repartition, because it doesn’t use partitions. In short, with ScaleDB the DBA needn’t have any visibility into the business to deliver optimal database performance via a single API. This is what makes ScaleDB an ideal solution for cloud companies implementing the database as a managed service or DaaS.

Tuesday, September 20, 2011

Cloud DaaS Managed Service Fuels NewSQL Market

As public clouds are commoditized, the public cloud vendors are increasingly moving to higher margin and stickier managed services. In the early days of the public cloud, renting compute and storage was unique, exciting, sticky and profitable. It has quickly become a commodity. In order to provide differentiation, maintain margins and create barriers to customer exit, against increasing competition, the cloud is moving toward a collection of managed services.

Public clouds are growing beyond simple compute instances to platform as a service (PaaS). PaaS is then comprised of various modules, including database as a service (DaaS). In the early days you rented a number of compute instances, loaded your database software and you were the DBA managing all aspects of that database. Increasingly, public clouds are moving toward a DaaS model, where the cloud customer writes to a simple database API and the cloud provider is the DBA.

If the database resides in a single server and does not require high-availability, providing that as a managed service is no problem. Of course, if this is the use case, then it is no problem for the customer to manage their own database. In other words, there is little value to a managed service.

The real value-add for the customer, and hence the real price premium, is derived by offering things like auto-scaling across multiple servers, hot backup, high-availability, etc. If the public cloud provider can offer a SQL-based DaaS, where the customer writes to a simple API and everything else is handled for them, that is a tremendous value and customers will pay a premium for it.

While this sounds simple, public cloud companies soon learn that the Devil is in the details. Managing someone else’s database, without insight into their business processes, performance demands, scaling demands, evolving application requirements, and more, is extremely challenging and demands a new class of DBMS. These demands have created a market need that is now being filled by companies using the moniker “NewSQL”.

In short, when it comes to DaaS, public cloud vendors want the following:
• Simple “write to our API and we’ll handle the messy stuff like scaling, HA, etc.”
• Premium value that translates to a higher profit margin business
• Barriers to customer exit

Future posts will delve into the operational demands of DaaS, and how these demands a driving NewSQL DBMS architectures and features.

Wednesday, August 24, 2011

The Future of NoSQL (Companies)…

A friend recently bought a GM car. I proceeded to inform him that I am shorting GM stock (technically a put option). He was shocked. “But they make great cars,” he exclaimed. I responded, “I’m not shorting the cars, I’m shorting the company.” Why am I recounting this exchange? Because I believe that the new wave of NoSQL companies—as opposed to the rebranded ODBMS—presents the same situation. I am long the products, but short the companies.

Let me explain. NoSQL companies have built some very cool products that solve real business problems. The challenge is that they are all open source products serving niche markets. They have customer funnels that are simply too small to sustain the companies given their low conversion/monetization rates.

These companies could certainly be tasty acquisition targets for companies that actually make money. But as standalone companies, sadly, I would short them. On that note, I am off to the NoSQL Now! Conference. Hopefully, this post won't get me beat-up while cruising the conference.


Monday, August 15, 2011

Do you need an elastic database?

Not every company or application needs an elastic database. Some applications can get by just fine on a single database server, rendering database elasticity moot from their perspective. To make this determination, simply ask yourself:

1. Will I need more than a single database server?
Look at your current load and your projected growth and ask yourself whether it will exceed the capacity of a single server. If it doesn’t now, nor will it in the future, then you don’t need an elastic database.

2. Will my load fluctuate sufficiently to warrant the investment in elasticity?
If your database requirements won’t experience fluctuations in demand—e.g. daily, weekly, monthly, seasonal changes in the number of servers required—then elasticity isn’t important. For example, if you have a social networking application that requires 2 database nodes 24x7, but peaks at 10 nodes for 2 hours a night, then elasticity is important. If your database has a steady load that requires 3 database nodes and that load doesn’t change, then elasticity isn’t important.

3. Will my database load grow with time?
I your database load will grow with time, then you need to evaluate the growth pattern and ask yourself whether you can handle this expansion manually, or if you simply want to rely on an elastic database to grow seamlessly.

If your answer to these three questions is no, then you don’t really need an elastic database. Now there are certainly other reasons to consider ScaleDB (e.g. high-availability, eliminate partitioning/sharding, eliminate master-slave issues, etc.) but elasticity may not be one of them.

Wednesday, August 3, 2011

Cloud Elasticity & Databases

The primary reasons people are moving to the public cloud are: (1) replace capital expenses with operating expenses (pay as you go); (2) use shared resources for processes like back-up, maintenance, networking (shared expenses); (3) use shared infrastructure that enables you to pay only for those resources you actually use, instead of consuming your maximum load resources at all times (pay-per-use). The first thing you’ll notice is that all 3 cloud benefits have their basis in finances or the cloud business model.

We will focus in on #3 above: Pay-Per-Use. The old school model was to build your compute infrastructure for the maximum load today, plus growth over the life-cycle of the equipment, plus some buffer so the systems don’t get overloaded from spikes in usage. The net result is that your average usage might run 10% of the potential for the infrastructure you mortgaged your home to buy. In other words, you were paying 10X more than you would pay if you only paid by usage. In reality, you might pay half as much to run on the cloud, with the balance of the savings going to the cloud company in the form of profits. This works and it is a win-win for both you and the public cloud.

To achieve this Pay-Per-Use ideal, and the compelling financial advantages it enables, the infrastructure must scale elastically. You must be able to add compute power seamlessly and on the fly, without shut-down. How important is this elasticity? Amazon named their service “EC2” for “Elastic Cloud Computing”. Elastic is the first word, I would say it is pretty important. Besides, if the cloud weren’t elastic, you would simply be paying for the same computer costs, plus the public cloud company’s markup for expenses and profit.

So how elastic are public clouds? The entire cloud stack is elastic, except for one piece, the SQL database. Cloud companies recognized that the SQL database was the Achilles heel of cloud elasticity. To address this problem, they created NoSQL, which delivers database-like capabilities, but removes the things that make a SQL database inelastic; namely SQL, ACID-compliance, data consistency, transactions, etc.

NewSQL appears to be the response from the database vendors, who believe that there is a market for SQL databases that provide cloud elasticity. Not all NewSQL solutions address elasticity, but a few of us do. In my next blog post, I’ll address whether or not database elasticity is important…hint: it depends upon your needs.