Tuesday, August 10, 2010

Comparing ScaleDB’s Shared Cache Tier vs. NFS and CFS

Prior posts addressed the performance benefits of a shared cache tier (ScaleDB CAS) and the storage flexibility it enables. This post compares ScaleDB CAS, a purpose-built system for sharing file storage, against off-the-shelf solutions like NFS and various cluster file systems (CFS).

When using a clustered database, like ScaleDB, each node has full access to all of the data in the database. This means that the file system (SAN, NAS, Cloud, etc.) must allow multiple nodes to share the data in the file system.

Options include:
1. Network File System (NFS)
2. Cluster File System (CFS)
3. Purpose-built file storage interface

Locking Granularity:
I won’t go deep into the nuances of CFS (block-level) and NFS (file-level, though you can address locations within the file). Suffice it to say that, generally speaking, NFS and CFS let you operate on blocks of data, which are typically 8KB. Let’s say you want to operate on a 200-byte record within an 8KB block. You are locking 8KB instead of 200 bytes, or roughly 40X more than necessary.

ScaleDB’s CAS uses a purpose-built interface to storage that is optimized to leverage insight from the cluster lock manager. This enables it to lock storage at the record level. When multiple nodes concurrently access data from the same block, this can be a significant performance advantage: it reduces contention between threads and nodes, enabling superior performance and nodal scalability.
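To make the granularity argument concrete, here is a minimal sketch in Python. It is not ScaleDB code: the function names, the request list, and the assumption that fixed-size 200-byte records are packed back to back into 8KB blocks are all hypothetical, chosen only to match the example above. It counts how many concurrent requests from different nodes would have to wait under block-level locks versus record-level locks.

```python
BLOCK_SIZE = 8 * 1024   # 8KB block, as in the example above
RECORD_SIZE = 200       # 200-byte record, as in the example above

def block_lock_key(record_id):
    """Hypothetical lock key when a lock covers the whole block holding the record."""
    records_per_block = BLOCK_SIZE // RECORD_SIZE  # 40 whole records fit per block
    return ("block", record_id // records_per_block)

def record_lock_key(record_id):
    """Hypothetical lock key when a lock covers only the record itself."""
    return ("record", record_id)

def count_conflicts(requests, key_fn):
    """Count requests that hit a lock key already held by a different node.

    Assumes all requests are concurrent, so no lock is released in between.
    """
    holders = {}
    conflicts = 0
    for node, record_id in requests:
        key = key_fn(record_id)
        owner = holders.setdefault(key, node)  # first requester takes the lock
        if owner != node:
            conflicts += 1
    return conflicts

# Three nodes touch three different records that share one block;
# the fourth request lands in a different block.
requests = [("node_a", 0), ("node_b", 1), ("node_c", 2), ("node_a", 45)]

block_conflicts = count_conflicts(requests, block_lock_key)
record_conflicts = count_conflicts(requests, record_lock_key)
print("block-level conflicts:", block_conflicts)
print("record-level conflicts:", record_conflicts)
```

In this toy workload, two of the four requests wait under block-level locking even though no two nodes touch the same record; under record-level locking none of them wait.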

Intelligent Control of RAM vs. Disk:
When writing data to storage, you can either flush it directly to disk or store it in cache and let the disk flush occur later, outside of the transaction. Some things, like log writing, require the former, while other things work just fine (and faster) with the latter. Unfortunately, generic file systems like NFS and CFS are not privy to this insight, so they must err on the side of caution and flush everything to disk inside the transaction.

ScaleDB’s CAS is privy to the intelligence inside the database. It is therefore able to push more data into cache for improved performance. Furthermore, this optimization can be configured by users, based on their own requirements. The net result is superior performance.
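The difference between the two write policies can be sketched as follows. This is an illustrative Python model, not the CAS implementation: the `WriteCache` class, the `needs_durability` flag, and the three-write transaction are all assumptions made up for the example. It counts how many disk flushes happen inside the transaction when everything is flushed immediately versus when only the log write is.

```python
class WriteCache:
    """Toy cache tier (hypothetical) that either flushes a write now or defers it."""

    def __init__(self, always_sync=False):
        self.always_sync = always_sync  # True models a generic file system
        self.disk = {}                  # data that has reached disk
        self.dirty = {}                 # data held in cache, not yet on disk
        self.sync_flushes = 0           # flushes performed inside transactions

    def write(self, key, value, needs_durability):
        if self.always_sync or needs_durability:
            # Flush directly to disk before the write returns.
            self.disk[key] = value
            self.dirty.pop(key, None)
            self.sync_flushes += 1
        else:
            # Keep in cache; the disk flush happens later.
            self.dirty[key] = value

    def read(self, key):
        # The cached copy is the newest one, so check it first.
        if key in self.dirty:
            return self.dirty[key]
        return self.disk.get(key)

    def flush_deferred(self):
        """Write cached data to disk outside of any transaction."""
        flushed = len(self.dirty)
        self.disk.update(self.dirty)
        self.dirty.clear()
        return flushed


def run_transaction(cache):
    # Only the log record must be on disk before the transaction completes.
    cache.write("log:1", "update row 7", needs_durability=True)
    cache.write("data:7", "new row", needs_durability=False)
    cache.write("index:7", "new index entry", needs_durability=False)
    return cache.sync_flushes


generic = WriteCache(always_sync=True)  # flushes everything inside the transaction
aware = WriteCache()                    # flushes only what needs durability

print("generic flushes in transaction:", run_transaction(generic))
print("database-aware flushes in transaction:", run_transaction(aware))
print("flushed later:", aware.flush_deferred())
```

In this model the generic policy pays for three flushes inside the transaction, while the database-aware policy pays for one and pushes the other two writes to disk afterwards.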

Conclusion:
As general-purpose solutions, NFS and CFS cannot benefit from insight into the internal operation of the database. Instead, they must act in a generalized manner. ScaleDB’s Cluster Accelerator Server (CAS) leverages insight gleaned from the cluster lock manager, and from user configurations, to optimize its interaction with storage. This makes CAS more efficient and scalable, and it improves performance.
