Cassandra in-memory configuration - cassandra

We are currently evaluating Apache Cassandra 1.2 as a large-scale data processing solution. Because our application is read-intensive, and to give users the fastest possible response time, we would like to configure Apache Cassandra to keep all data in memory.
Is it enough to set the caching storage option to rows_only on all column families and give each Cassandra node enough memory to hold its portion of the data? Or are there other options in Cassandra?

Read performance tuning is much more complex than write tuning. Based on my experience, there are several factors to take into consideration. Some of them are not memory related, but they also help improve read performance.
1. Row cache: avoids disk hits, but enable it only if the rows are not updated frequently. You could also enable the off-heap row cache to reduce JVM heap usage.
2. Key cache: enabled by default, and there is no need to disable it. It avoids a disk search when the row cache is not hit.
3. Reduce the frequency of memtable flushes: adjust memtable_total_space_in_mb, commitlog_total_space_in_mb, and flush_largest_memtables_at.
4. Use LeveledCompactionStrategy: it keeps a row from being spread across many SSTables.
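
To judge whether each node can really hold "its data portion" in memory, a rough sizing calculation helps. The sketch below is only back-of-the-envelope arithmetic: it assumes data is spread evenly across nodes and ignores cache overhead and compression, and the function names and figures are made up for illustration.

```python
def data_per_node_gb(total_data_gb, replication_factor, nodes):
    """Approximate data each node stores, assuming an even distribution."""
    return total_data_gb * replication_factor / nodes


def fits_in_memory(total_data_gb, replication_factor, nodes, cache_memory_gb):
    """True if a node's share of the data fits in the memory set aside for caching."""
    return data_per_node_gb(total_data_gb, replication_factor, nodes) <= cache_memory_gb


# Hypothetical cluster: 600 GB of unique data, replication factor 3.
print(data_per_node_gb(600, 3, 12))       # share per node with 12 nodes
print(fits_in_memory(600, 3, 12, 128))    # 128 GB of cache memory per node
print(fits_in_memory(600, 3, 24, 128))    # same data on 24 nodes
```

Note that the replication factor multiplies the memory needed cluster-wide, so adding replicas raises the per-node requirement unless nodes are added as well.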

DataStax has added an in-memory computing feature in the latest version of its Apache Cassandra-based NoSQL database, as part of a drive to increase the performance of online applications.


What is the difference between scylla read path and cassandra read path?

What is the difference between the Scylla read path and the Cassandra read path? When I stress-test Cassandra and Scylla, Scylla's read performance is about 5 times worse than Cassandra's, using 16 cores and ordinary HDDs.
I expect better read performance from Scylla than from Cassandra on ordinary HDDs, because my company doesn't provide SSDs.
Can someone please confirm whether it is possible to achieve better read performance using ordinary HDDs?
If yes, what changes are required in the Scylla config? Please guide me!
Some other responses focused on write performance, but this isn't what you asked about - you asked about reads.
Uncached read performance on HDDs is bound to be poor in both Cassandra and Scylla, because each read from disk requires several seeks on the HDD, and even the best HDD cannot do more than, say, 200 of those seeks per second. Even with a RAID of several of these disks, you will rarely be able to do more than, say, 1000 requests per second. Since a modern multi-core machine can do orders of magnitude more CPU work than 1000 requests per second, in both the Scylla and Cassandra cases you'll likely see free CPU. So Scylla's main benefit, using much less CPU per request, will not even matter when the disk is the performance bottleneck. In such cases I would expect Scylla's and Cassandra's performance (I am assuming that you're measuring throughput when you talk about performance?) to be roughly the same.
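
The arithmetic behind that ceiling can be sketched as follows. The 200 seeks per second figure comes from the paragraph above; the value of 4 seeks per read and the 8-disk RAID are assumptions chosen only to make "several" concrete, and the function name is hypothetical.

```python
def max_reads_per_second(seeks_per_second_per_disk, seeks_per_read, disks=1):
    """Upper bound on uncached reads/s when every read costs several disk seeks."""
    return disks * seeks_per_second_per_disk / seeks_per_read


# One HDD doing ~200 seeks/s, with an assumed 4 seeks per read.
print(max_reads_per_second(200, 4))

# An assumed RAID of 8 such disks, ideally parallelised.
print(max_reads_per_second(200, 4, disks=8))
```

Under these assumptions both results stay far below what the CPUs could serve, which is why the disk, not the database engine, sets the limit.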
If, still, you're seeing better throughput from Cassandra than Scylla, there are several details that may explain why, beyond the general client mis-configuration issues raised in other responses:
If you have a small amount of data that can fit in memory, Cassandra's caching policy is better for your workload. Cassandra uses the OS's page cache, which reads whole disk pages and may cache multiple items, as well as multiple index entries, in one read. Scylla works differently: it has a row cache that caches only the specific data read. Scylla's caching is better for large volumes of data that do not fit in memory, but much worse when the data can fit in memory, until the entire data set has been cached (after everything is cached, it becomes very efficient again).
On HDDs, the details of compaction are very important for read performance - if in one setup you have more sstables to read, it can increase the number of reads and lower the performance. This can change depending on your compaction configuration, or even randomly (depending on when compaction was run last). You can check if this explains your performance issues by doing a major compaction ("nodetool compact") on both systems and checking the read performance afterwards. You can switch the compaction strategy to LCS to ensure that random-access read performance is better, at the cost of more write work (on HDDs, this can be a worthwhile compromise).
If you are measuring scan performance (reading an entire table) instead of reading individual rows, other issues become relevant: As you may have heard, Scylla subdivides each node into shards (each shard is a single CPU). This is fantastic for CPU-bound work, but could be worse for scanning tables which aren't huge, because each sstable is now smaller and the amount of contiguous data you can read before needing to seek again is lower.
I don't know which of these differences, or something else, is causing the performance of your use case to be lower in Scylla, but please keep in mind that whatever you fix, your performance is always going to be bad with HDDs. With SSDs, we've measured in the past more than a million random-access read requests per second on a single node. HDDs cannot come anywhere close. If you really need optimum performance or performance per dollar, SSDs are really the way to go.
There can be various reasons why you are not getting the most out of your Scylla Cluster.
The number of concurrent connections from your clients/loaders is not high enough, or you're not using enough loaders. In that case, some shards will be doing all the work, while others will be mostly idle. You want to keep your parallelism high.
Scylla likes to have a minimum of 2 connections per shard (you can see the number of shards in /etc/scylla.d/cpuset.conf).
What's the size of your dataset? Are you reading a large number of partitions or just a few? You might be hitting a hot-partition situation.
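
As a rough illustration of the connection guideline above, the sketch below counts the CPUs in a Linux-style CPU list (for example "0-15") and multiplies by 2. It assumes one shard per listed CPU and that you have already extracted the CPU list yourself; the exact layout of cpuset.conf is not parsed here, and both function names are hypothetical.

```python
def count_cpus(cpu_list):
    """Count CPUs in a Linux-style CPU list such as "0-3,8-11"."""
    total = 0
    for part in cpu_list.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            total += int(high) - int(low) + 1  # ranges are inclusive
        else:
            total += 1
    return total


def min_client_connections(shards, per_shard=2):
    """Minimum total client connections for the '2 per shard' rule of thumb."""
    return shards * per_shard


shards = count_cpus("0-15")  # assumed: one shard per CPU
print(shards, min_client_connections(shards))
```

The total applies across all loaders combined, so several loader machines can share it.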
I strongly recommend reading the following docs that will provide you more insights:
@Sateesh, I want to add to the answer by @TomerSan that both Cassandra and ScyllaDB use the same disk storage architecture (LSM). That means they have roughly the same disk access patterns, because the algorithms are largely the same. LSM trees were built with the idea that it is not necessary to do instant in-place updates. An LSM tree consists of immutable data buckets that are large contiguous pieces of data on disk. That means less random IO and more sequential IO, for which HDDs work great (not counting the parallelism used by modern database implementations).
All of the above means that the difference you see is not caused by a difference in how those databases use the disk. It must be related to configuration differences and what happens underneath. Maybe ScyllaDB tries to use more parallelism or compacts more aggressively. It depends.
In order to be able to say anything specific, please share your tests, envs, and configurations.
Both databases use an LSM tree, but Scylla has a thread-per-core architecture on top, plus we use O_DIRECT while C* uses the page cache. Scylla also has a sophisticated IO scheduler that makes sure not to overload the disk, which is why scylla_setup automatically runs a benchmark to tune it. Check its output in io.conf.
There are far more things to review, better to send your data to the mailing list. In general, Scylla should perform better in this case as well but your disk is likely to be the bottleneck in both cases.
As a summary, I would say ScyllaDB and Cassandra have the same read/write path: memtable, commitlog, sstable.
However, the implementation is very different:
- Cassandra relies on the OS for low-level IO and networking (as most DBMSs do)
- ScyllaDB relies on its own library (Seastar) to handle IO and networking at a low level, independently of the OS page cache etc. This is why it can provide features such as workload scheduling within the same cluster, which would be very hard to implement in Cassandra.

Cache table in memory is spilling over to disk - apache-spark

How to pin the table in cache so it would not swap out of memory?
Situation: We are using MicroStrategy BI reporting, and the semantic layer is built. We wanted to cache heavily used tables using Spark SQL's CACHE TABLE; we cached them in the Spark context (Thrift server). Initially everything was in cache, but now some of it is in memory and some is on disk. That disk may be local disk, which may be relatively more expensive to read than S3. Queries may take longer and run for inconsistent times from the user's perspective. When more queries run against the cached tables, copies of the cached table images are made, and the copies do not stay in memory, causing reports to run longer. So how do we pin a table so that it is not swapped to disk? Spark memory management uses dynamic allocation; how can we pin those few tables in memory?
There are a couple of different ways to tackle this. First, keep in mind that memory is shared between storage and execution, so big joins and other operations that need temporary storage may be competing for memory space. You may want to look at "spark.memory.storageFraction", which currently defaults to 0.5. Consider 0.75, but this will likely slow down your queries. Also consider applying good data engineering to the problem. Reduce the amount of data that needs to be stored: create a temp view with old records removed and unneeded columns pruned, then cache that. Consider using smaller data types for improved storage; for example, ints are more space efficient than big strings. Lastly, consider switching to an instance type that has more memory available, or to an instance with fast local disks. In some situations disk storage isn't that much slower than in-memory storage. This is particularly true if you're running big, complicated analytical queries where the cluster is CPU bound and not IO bound.
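
To see what changing spark.memory.storageFraction buys, here is a rough calculation of the storage memory that execution cannot evict. It assumes Spark's unified memory model as I understand it (about 300 MB reserved, spark.memory.fraction defaulting to 0.6); check these numbers against the documentation for your Spark version. The 8 GB executor heap is an arbitrary example.

```python
RESERVED_MB = 300  # assumed fixed reservation in the unified memory model


def protected_storage_mb(heap_mb, memory_fraction=0.6, storage_fraction=0.5):
    """Storage memory (MB) that execution cannot evict, per executor."""
    unified = (heap_mb - RESERVED_MB) * memory_fraction
    return unified * storage_fraction


heap_mb = 8 * 1024  # hypothetical 8 GB executor heap
print(round(protected_storage_mb(heap_mb), 1))                         # storageFraction 0.5
print(round(protected_storage_mb(heap_mb, storage_fraction=0.75), 1))  # storageFraction 0.75
```

Multiplying the result by the number of executors gives an upper bound on the cached data that is safe from eviction, which can be compared with the size of the tables you want to keep in memory.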

What are the impacts of high value row cache?

Recently I went through a tutorial about the key cache and row cache. Can anyone help me with some real-world examples where these caches have an impact? And what is the impact if we increase these values in the config file?
Running desc table, I found this:
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
Your main concern is the memory profile of your application.
This diagram demonstrates how the key cache optimises the read path: it allows us to skip the partition summary and partition index and go straight to the compression offset. As for the row cache, if you get a hit, you've got your answer and don't need to go down the read path at all.
Key cache - The key cache is on by default as it only keeps the key of the row. Keys are typically smaller relative to the rest of the row so this cache can hold many entries before it's exhausted.
Row cache - The row cache holds an entire row and is useful when you have a fairly static querying pattern. The argument for the row cache is that if you read the same rows over and over, you can just keep them in memory rather than going to the SSTable (storage medium) level, and thus bypass an expensive seek on the read path. In practice, the memory slowdowns caused by using the row cache in non-optimal use cases make it an unpopular feature.
So what happens if you fill up the cache? Well, there's an eviction policy, but if you're constantly kicking stuff out of either cache to make room for new items, then the caches won't exactly be useful, as the GC-related performance degradation will hurt overall performance.
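
The churn described above can be illustrated with a toy least-recently-used cache. This is a generic LRU simulation, not Cassandra's actual cache implementation, and the workload (repeatedly scanning 100 keys) is an assumption chosen to show the worst case.

```python
from collections import OrderedDict


def lru_hit_rate(accesses, capacity):
    """Replay a sequence of key accesses against an LRU cache; return the hit rate."""
    cache = OrderedDict()
    hits = 0
    for key in accesses:
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used key
    return hits / len(accesses)


workload = list(range(100)) * 10  # scan the same 100 keys ten times

print(lru_hit_rate(workload, 100))  # cache holds the whole working set
print(lru_hit_rate(workload, 50))   # cache holds half of it
```

With the smaller capacity every key is evicted before it is read again, so the cache does constant insert and evict work without ever serving a hit.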
What about having very high cache values? This is where there are better alternatives, more on this later. Making the row cache huge would just lead to GC issues, which depending on what you're doing exactly, typically leads to an overall net-loss in performance.
One idea I've seen being utilised relatively well is having a caching layer on top of Cassandra, such as Apache Ignite or Memcached. You load hot data in the caching layer to get fast READs and you write with an application that writes to the cache layer then to C* for persistence. These architectures come with many of their own headaches but if you want to cache data for lower query latencies, the C* row cache isn't the best tool for the job.

Cassandra vs Cassandra+Ignite

(Single-node cluster) I've got a table with 2 columns: one is of type 'text' and the other is a 'blob'. I'm using DataStax's C++ driver to perform read/write requests in Cassandra.
The blob stores a C++ structure (size: 7 KB).
Since I was getting less than desirable throughput when using Cassandra alone, I tried adding Ignite on top of Cassandra, in the hope of a significant performance improvement, as the data would now be read from RAM instead of hard disks.
However, it turned out that after adding Ignite, the performance dropped even more (by roughly 50%!).
Read Throughput when using only Cassandra: 21000 rows/second.
Read Throughput with Cassandra + Ignite: 9000 rows/second.
Since I am storing a C++ structure in Cassandra's blob, the Ignite API serializes/deserializes the data on every write/read. Is this the reason for the drop in performance (consider the size of the structure, i.e. 7 KB), or is this drop not expected at all and maybe something is wrong in the configuration?
Cassandra: 3.11.2
RHEL: 6.5
Configurations for Ignite are the same as given here.
I got significant improvement in Ignite+Cassandra throughput when I used serialization in raw mode. Now the throughput has increased from 9000 rows/second to 23000 rows/second. But still, it's not significantly superior to Cassandra. I'm still hopeful to find some more tweaks which will improve this further.
I've added some more details about the configurations and client code on GitHub.
It looks like you do one get per key in this benchmark for Ignite, and you didn't invoke loadCache before it. In this case, on each get, Ignite will go to Cassandra to fetch the value and only afterwards store it in the cache. So I'd recommend invoking loadCache before benchmarking, or at least testing gets on the same keys, to give Ignite an opportunity to store the keys in the cache. If you think you already have all the data in the caches, please share the code where you write data to Ignite too.
Also, you invoke "grid.GetCache" in each thread - it won't take a lot of time, but you definitely should avoid such things inside benchmark, when you already measure time.
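
The effect of benchmarking a cold read-through cache can be shown with a small simulation. This is plain Python with a dict standing in for Cassandra; it does not use the Ignite API, and load_all is only a hypothetical stand-in for what loadCache does.

```python
class ReadThroughCache:
    """Toy read-through cache in front of a slower backend (a dict here)."""

    def __init__(self, backend):
        self.backend = backend
        self.entries = {}
        self.backend_reads = 0  # counts per-key trips to the backend

    def load_all(self):
        """Bulk-load everything up front (stand-in for a cache warm-up call)."""
        self.entries.update(self.backend)

    def get(self, key):
        if key not in self.entries:
            self.backend_reads += 1  # cache miss: fetch from the backend
            self.entries[key] = self.backend[key]
        return self.entries[key]


def backend_reads_for_one_pass(cache, keys):
    """Read every key once and report how many reads went to the backend."""
    before = cache.backend_reads
    for key in keys:
        cache.get(key)
    return cache.backend_reads - before


backend = {f"key{i}": i for i in range(1000)}

cold = ReadThroughCache(backend)
print(backend_reads_for_one_pass(cold, backend))  # first pass: every get misses

warm = ReadThroughCache(backend)
warm.load_all()
print(backend_reads_for_one_pass(warm, backend))  # warmed up: no backend reads
```

A benchmark that reads each key exactly once against the cold cache measures the backend plus the cache overhead, which matches the slowdown described in the question.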

Cassandra as distributed cached data store

Can we use Cassandra as a distributed in-memory cache database by utilizing its file level caching, key cache, and row cache?
I don't want to overload each node and I want to add more nodes to the cluster when the data grows to make this effective (to let most of my data be cached). Especially since 40% of my column families are static, and updates/insertions to other tables are not much.
Our primary aim is an elastic real-time data store (about as fast as an in-memory DB).
Cassandra was not designed for this goal, but after many optimizations it has also become a tool for in-memory caching. There are a few experiments; the most significant one I know of is the one reported by Netflix. At Netflix they replaced their EVCache system (which was persisted by a Cassandra backend) with a new SSD Cassandra-based cache architecture, and the results are very impressive in terms of performance improvement and cost reduction.
Before choosing Cassandra as a replacement for any cache system, I'd recommend understanding row caching and key caching in depth. Also, I've never used DataStax Enterprise, but it has an interesting in-memory table feature.
I guess you could, but I don't think that's the correct use case for Cassandra. Without knowing more about your requirements, I'd recommend you have a look at products like Hazelcast, which is an in-memory distributed cache and sounds like a better fit for your use case.
I know it's a little late, but I've just come across this post while doing some research on Cassandra.
I've seen success with Tibco's AST (recently rebranded to DTM) for in-memory caching.
I've also played around with Pivotal's GemFire (this uses Geode under the covers), which has shown some promise.
