Why your Cassandra needs local NVMe and Rok

October 29, 2019

Do you run a NoSQL database, like Cassandra or MongoDB, on the cloud or on-prem, and find it terribly slow?

You are probably already paying a lot of money for the infrastructure, and slow performance is the last thing you need.

Cassandra is a very popular NoSQL database, and it is widely deployed on AWS (Amazon Web Services) and GCP (Google Cloud Platform). The seemingly insoluble deployment dilemma, which has been discussed constantly over the past few years, is whether you should run Cassandra over local storage (ephemeral disks on AWS, local SSDs on GCP, local DAS on-prem) or shared storage (EBS on AWS, PDs on GCP, SDS/SAN on-prem).

The former option offers great performance, but no flexibility, while the latter is very flexible, but cripples performance and adds cost.

DataStax documentation recommends running Cassandra on ephemeral disks instead of using shared storage. However, a lot of people (see companies like Spotify, CrowdStrike, Librato) have been moving to shared storage lately to gain from its flexibility, for example to migrate nodes easily and to take backups.

In terms of CPU and RAM, you have a wide range of choices. The cloud providers offer various types of instances, so that won’t be a problem. But what about IOPS? The flexible, shared storage that cloud providers offer is great for keeping your data safe and always available. However, it comes with an important downside: it is poor in IOPS.

What if you could have your application running on extremely fast, local NVMe SSDs, while keeping the flexibility of shared storage at the same time?

Well, this is what we are building, and we call it Rok.

Rok is decentralized storage for the cloud native world. We believe modern, highly mobile apps need to discover persistent data instantly and access it fast, anywhere they run. That’s why we designed the first software product to combine the performance of local storage with the flexibility of shared storage, while enabling seamless collaboration on data, across your global infrastructure. Rok allows you to run your stateful containers over fast, local NVMe storage on-prem or on the cloud, and still be able to snapshot the containers and distribute them efficiently: across machines of the same cluster, or across distinct locations and administrative domains over a decentralized network. We think performance and flexibility shouldn’t be mutually exclusive anymore. One should have both. Everywhere.

So, by running Cassandra with Rok you get all the advantages of running over local NVMe storage:

extremely high IOPS
I/O latency on the order of microseconds (μs)
massive scale-out, with excellent scalability as the cluster grows
significant cost savings compared to running over shared storage

while keeping all the advantages of shared storage:

local backups
offsite backups
node migrations

Moreover, you get something that wasn’t possible before:

collaboration at global scale

This means that you can take a snapshot of your NoSQL database along with its data, and share it with another user of a completely distinct administrative domain, at a distinct location. This is a perfect match for test & dev use cases, analytics, or forensics.

*The answer is: local storage with the flexibility of shared storage using Rok*

Now that I have your attention, I will let the numbers speak for themselves.

I will present an analysis of cost and performance for a 100TB cluster on AWS and GCP.

Cassandra on AWS

We are going to compare the cost and performance of two Cassandra clusters on AWS. Each one has 100TB of raw capacity.

	Shared Storage	NVMe + Rok	Comparison
Storage Type	EBS (io1)	Local NVMe	–
Storage Capacity (raw)	100 TB	100 TB	–
Instance Type	c4.4xlarge	i3.4xlarge	–
Number of instances	27	27	–
Total vCPUs	432	432	same
Total GB of RAM	810	3,294	4x better
Nominal aggregate write IOPS	432 K	9,720 K	22x better
Nominal aggregate read IOPS	432 K	22,275 K	51x better
Cost per month	$50,838	$18,016	64% cheaper

Comparison of running a Cassandra cluster on AWS over shared storage (EBS) and over local NVMe: the latter approach yields 22 times more nominal aggregate write IOPS, 51 times more nominal aggregate read IOPS, and a 64% cost reduction.

We can see that using Rok and local NVMe-backed instances on AWS, you get more than 51x the nominal aggregate read IOPS, and more than 22x the nominal aggregate write IOPS, with over 60% cost reduction, keeping all the flexibility you need.
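The Comparison column can be reproduced with a few lines of arithmetic. The following is a minimal Python sketch that copies the figures from the AWS table above; the dictionary keys and helper names are made up for illustration and are not part of Rok or of any AWS tooling.

```python
# Figures copied from the AWS table above (IOPS in thousands, cost in USD/month).
# All names here are illustrative assumptions, not an existing API.
shared = {"ram_gb": 810, "write_kiops": 432, "read_kiops": 432, "cost_usd": 50_838}
local = {"ram_gb": 3_294, "write_kiops": 9_720, "read_kiops": 22_275, "cost_usd": 18_016}


def improvement(local_value, shared_value):
    """Return how many times larger the local NVMe figure is than the shared one."""
    return local_value / shared_value


def cost_reduction_pct(local_cost, shared_cost):
    """Return the monthly cost saving of local NVMe as a percentage of the shared cost."""
    return 100 * (1 - local_cost / shared_cost)


ram_x = improvement(local["ram_gb"], shared["ram_gb"])
write_x = improvement(local["write_kiops"], shared["write_kiops"])
read_x = improvement(local["read_kiops"], shared["read_kiops"])
saving = cost_reduction_pct(local["cost_usd"], shared["cost_usd"])

print(f"RAM:        {ram_x:.1f}x")    # 4.1x
print(f"Write IOPS: {write_x:.1f}x")  # 22.5x
print(f"Read IOPS:  {read_x:.1f}x")   # 51.6x
print(f"Cost:       {saving:.1f}% cheaper")  # 64.6% cheaper
```

The exact ratios come out slightly above the values in the table, which appear to be rounded down (for instance, 22.5x rather than 22x for writes).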

Cassandra on GCP

We are going to compare cost and performance of two Cassandra clusters on GCP. Each one has 100TB of raw capacity.

	Shared Storage	NVMe + Rok	Comparison
Storage Type	PDs	Local NVMe	–
Storage Capacity (raw)	100 TB	100 TB	–
Instance Type	n1-standard-8	n1-standard-8	–
Number of instances	34	34	–
Total vCPUs	272	272	same
Total GB of RAM	1,020	1,020	same
Nominal aggregate write IOPS	510 K	12,240 K	24x better
Nominal aggregate read IOPS	510 K	23,120 K	45x better
Cost per month	$23,942	$16,178	32% cheaper

Comparison of running a Cassandra cluster on GCP over shared storage (PDs) and over local NVMe: the latter approach yields 24 times more nominal aggregate write IOPS, 45 times more nominal aggregate read IOPS, and a 32% cost reduction.

We can see that using Rok and local NVMe-backed instances on GCP, you get more than 45x the nominal aggregate read IOPS, and 24x the nominal aggregate write IOPS, with more than 30% cost reduction, keeping all the flexibility you need.
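Another way to read the GCP table is per instance and per dollar. The sketch below, again in Python, divides the aggregate figures by the 34 instances and derives a cost per thousand nominal read IOPS. That last metric does not appear in the table; it is computed here only for illustration, and all names are hypothetical.

```python
# Figures copied from the GCP table above (IOPS in thousands, cost in USD/month).
INSTANCES = 34

clusters = {
    "shared (PDs)": {"write_kiops": 510, "read_kiops": 510, "cost_usd": 23_942},
    "local NVMe": {"write_kiops": 12_240, "read_kiops": 23_120, "cost_usd": 16_178},
}


def per_instance(aggregate, instances=INSTANCES):
    """Split an aggregate figure evenly across the instances of the cluster."""
    return aggregate / instances


def usd_per_kiops(cost_usd, kiops):
    """Monthly cost per thousand nominal IOPS (a derived metric, not in the table)."""
    return cost_usd / kiops


summary = {}
for name, figures in clusters.items():
    summary[name] = {
        "write_kiops_per_instance": per_instance(figures["write_kiops"]),
        "read_kiops_per_instance": per_instance(figures["read_kiops"]),
        "usd_per_read_kiops": usd_per_kiops(figures["cost_usd"], figures["read_kiops"]),
    }

for name, row in summary.items():
    print(
        f"{name}: {row['write_kiops_per_instance']:.0f}K write and "
        f"{row['read_kiops_per_instance']:.0f}K read IOPS per instance, "
        f"${row['usd_per_read_kiops']:.2f} per 1K read IOPS per month"
    )
```

Under these figures, each instance goes from 15K nominal IOPS on PDs to 360K write and 680K read IOPS on local NVMe, while the monthly cost per thousand nominal read IOPS drops from roughly $47 to roughly $0.70.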

Conclusion

Now that the Rok data management platform adds the flexibility you need to this setup, and with NVMe prices going down, running over local NVMe storage is a very compelling option. We strongly recommend NVMe-backed instances combined with Rok; the performance boost you will experience, along with the associated cost savings, will surprise you.

If you have any questions, or want to learn more about the proposed solution, don’t hesitate to drop us a line at contact@arrikto.com or get started with MiniKF.