Apache Cassandra: ideal num_tokens value for a write-heavy application

I am setting up a Cassandra 3.11 cluster and want to know the ideal value for num_tokens. By default, cassandra.yaml sets it to 256. Some blogs on Cassandra suggest setting num_tokens to 32. Is there any performance issue with 256 tokens, or can we use values like 16 or 32? How can we decide the optimal value? Please suggest.

The num_tokens value as such has no impact on writes. Most of the issues with a large num_tokens value concern operational maintenance (repair) and range scans while reading. A high num_tokens value has been found to create hotspots, and more partition ranges increase repair time. The community reduced the default value of num_tokens from 256 to 16 in the Cassandra 4.0 release. You can see the JIRA here.
You can experiment with lower values of num_tokens. Don't forget to use allocate_tokens_for_keyspace (see the Cassandra documentation).
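To see why the allocation option matters at low num_tokens, here is a rough simulation in Python. It is only a sketch under stated assumptions: tokens are picked uniformly at random (which is what a node does when no allocation keyspace is configured), the ring is modelled as 2**64 positions without the real signed offset, and the function name ownership is made up for this example.

```python
import random

RING_SIZE = 2**64  # size of the Murmur3Partitioner token space (offset ignored)

def ownership(num_nodes, num_tokens, seed=0):
    """Fraction of the ring owned by each node when tokens are picked at random."""
    rng = random.Random(seed)
    tokens = sorted(
        (rng.randrange(RING_SIZE), node)
        for node in range(num_nodes)
        for _ in range(num_tokens)
    )
    owned = [0] * num_nodes
    for i, (token, node) in enumerate(tokens):
        previous = tokens[i - 1][0]  # i == 0 wraps around to the last token
        # A node owns the range that ends at each of its tokens.
        owned[node] += (token - previous) % RING_SIZE
    return [share / RING_SIZE for share in owned]

# Compare how evenly a hypothetical 6-node ring is split for a few settings.
for num_tokens in (4, 16, 256):
    shares = ownership(num_nodes=6, num_tokens=num_tokens)
    print(num_tokens, f"min={min(shares):.3f} max={max(shares):.3f}")
```

With random placement, the gap between the smallest and largest share generally widens as num_tokens drops, and that imbalance is what a smarter allocation is meant to avoid.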

Related

Relation between system specifications and Cassandra configuration parameters

Is there a relation between Cassandra's configuration parameters (given below with current values), the DataStax C++ driver's configuration parameters (given below with current values), and the node's hardware specifications (number of processors, RAM, number of disks, etc.)?
Cassandra's Configuration Parameters(in YAML)
concurrent_reads set as 16
concurrent_writes set as 256
native_transport_max_threads set as 256
native_transport_max_frame_size_in_mb set as 512
Datastax's C++ Driver Configuration Parameters
cass_cluster_set_num_threads_io set as 10
cass_cluster_set_core_connections_per_host set as 1
cass_cluster_set_max_connections_per_host set as 20
cass_cluster_set_max_requests_per_flush set as 10000
Node's specs
No. of processors: 32
RAM: >150 GB
No. of hard disks: 1
Cassandra's Version: 3.11.2
Datastax C++ driver version: 2.7
RHEL version: 6.5
I have a cluster of 2 nodes and I've been getting dismal throughput (12000 ops/second), where 1 operation = read + write (I can't use row cache). Is there any parameter that should have been set higher or lower, considering the nodes' specs?
Please also note that my read+write application is multi-threaded (10 threads). Also, I'm doing asynchronous read + asynchronous write (using futures).
The replication factor is 2, both nodes are in the same DC, and the consistency level for both reads and writes is also 2.
Some of the configuration properties in Cassandra are computed from available CPU cores and drives.
concurrent_reads = 16 * (number of drives)
concurrent_writes = 8 * (CPU cores)
It looks like you've done that, although I would question whether or not your 32 CPUs are all physical cores, or hyper-threaded.
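The two rules of thumb above can be written as a small helper. This is just a sketch in Python: the function name suggested_concurrency is hypothetical, and the multipliers are the ones quoted in this answer, not values read from a running node.

```python
def suggested_concurrency(num_drives: int, cpu_cores: int) -> dict:
    """Apply the rules of thumb quoted above (16 per drive, 8 per core)."""
    return {
        "concurrent_reads": 16 * num_drives,
        "concurrent_writes": 8 * cpu_cores,
    }

# The node described in the question: 1 hard disk, 32 processors.
settings = suggested_concurrency(num_drives=1, cpu_cores=32)
print(settings)  # {'concurrent_reads': 16, 'concurrent_writes': 256}
```

If only 16 of the 32 processors turn out to be physical cores, the same rule gives concurrent_writes = 128.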
"I have a cluster of 2 nodes and I've been getting dismal throughput (12000 ops/second)."
Just my opinion, but I think 12k ops/sec is pretty good. Actually REALLY good for a two node cluster. Cassandra scales horizontally, and linearly at that. So the solution here is an easy one...add more nodes.
What is your target operations per second? Right now, you're proving that you can get 6k ops/second per node. Which means, if you add another, the cluster should support 18K/sec. If you go to six nodes, you should be able to support 36k/sec. Basically, figure out your target, and do the math.
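"Do the math" here could look like the following Python sketch. It assumes perfectly linear scaling, which is the simplification this answer makes; the helper names and the 50,000 ops/sec target are made up for illustration.

```python
import math

def cluster_throughput(nodes: int, ops_per_node: float) -> float:
    """Estimated cluster throughput under the linear-scaling assumption."""
    return nodes * ops_per_node

def nodes_needed(target_ops: float, ops_per_node: float) -> int:
    """Smallest node count whose estimated throughput meets the target."""
    return math.ceil(target_ops / ops_per_node)

ops_per_node = 12_000 / 2  # 6k ops/sec per node, measured on the 2-node cluster

print(cluster_throughput(3, ops_per_node))  # 18000.0
print(cluster_throughput(6, ops_per_node))  # 36000.0
print(nodes_needed(50_000, ops_per_node))   # 9 (50,000 is a hypothetical target)
```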
One thing you might consider is trying ScyllaDB. Scylla is a drop-in replacement for Cassandra that trumpets the ability to hit very high throughput requirements. The drawback is that I think Scylla is only Cassandra 2.1 or 2.2 compatible at the moment. But it might be worth a try based on what you're trying to do.

How to calculate row_cache_size_in_mb and key_cache_size_in_mb in cassandra?

I have a cluster with 3 DCs with 3, 3, and 1 nodes respectively. I want to tune the cluster for better performance and hence want to enable row_cache and set its size in cassandra.yaml, but I don't know how to calculate its size. I tried the utility https://github.com/joaquincasares/heap-calculator but could not use it, because the script avg_row_size_calculator is not supported with Cassandra version 3.0.
Is there any formula or utility to calculate the row_cache_size and key_cache_size?
Total available memory per node is 32G and MAX_HEAP_SIZE is 14G.

How Cassandra In-memory feature works

I have a few questions regarding the In-Memory feature in Cassandra.
1.) I have a 4-node datacenter, and in OpsCenter, under memory usage, it shows there is 100GB of in-memory available. Does that mean each of the 4 nodes has 100GB of memory available, or is 100GB the total in-memory capacity for my datacenter?
2.) If 100GB really is available for In-Memory for a datacenter, is it advisable to use the full capacity? Do I need to factor in the replication factor as well? Say I have 15GB of data that I want to store In-Memory: if the replication factor is 2, will we have 30GB of data In-Memory for the datacenter?
3.) In the dse.yaml file, there is a property "max_memory_to_lock_fraction" whose value is a percentage of system memory, and by default it is 20%. As per the guidelines from Datastax Cassandra, we need to ensure that in-memory usage does not exceed 45% of total available system memory for each node. Is "max_memory_to_lock_fraction" the parameter that needs to be set to 45%?
4.) The Datastax documentation says compression needs to be removed for an In-Memory table. If compression is set anyway, will it affect read/write performance?
5.) The output of dsetool inmemorystatus has a parameter called "Current Total memory not able to lock". Does the value of this parameter denote the available memory? For example, if the value is 1024MB, does it mean that 1GB of In-Memory is still available for use?
I am using DSE version 4.8.11. Please help me, as I am trying to understand this feature so as to leverage it best.
Thanks in advance.
1) It depends on how you configure it: it can be per cluster (all of the available memory), or you can view graphs of individual nodes.
2) Yes, the replication factor multiplies the total data by that factor. You will have to account for that at the cluster level. A very nice tool to help you start: https://www.ecyrd.com/cassandracalculator/
3) Yes, max_memory_to_lock_fraction is what you are looking for.
4) It will increase processing time. Since writes in Cassandra are actually CPU-bound, leaving compression on is probably not the best idea performance-wise.
5) Yes, this means there is still memory (of the specified amount), but due to settings Cassandra is unable to lock it.
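The arithmetic behind points 2 and 3 can be sketched in Python. Everything here is illustrative: the helper names are made up, data is assumed to spread evenly over the nodes, and the 64GB of RAM per node is a hypothetical figure, since the question does not state it.

```python
def in_memory_footprint_gb(data_gb: float, replication_factor: int) -> float:
    """Total data held across the datacenter once every replica is counted."""
    return data_gb * replication_factor

def per_node_share_gb(data_gb: float, replication_factor: int, nodes: int) -> float:
    """Average per-node share, assuming data spreads evenly over the nodes."""
    return in_memory_footprint_gb(data_gb, replication_factor) / nodes

def within_guideline(per_node_gb: float, node_ram_gb: float,
                     max_fraction: float = 0.45) -> bool:
    """Check the 45%-of-system-memory guideline mentioned in the question."""
    return per_node_gb <= max_fraction * node_ram_gb

total = in_memory_footprint_gb(15, 2)   # 15GB of data, replication factor 2
share = per_node_share_gb(15, 2, 4)     # spread over the 4 nodes
print(total)                            # 30
print(share)                            # 7.5
print(within_guideline(share, node_ram_gb=64))  # True (64GB RAM is assumed)
```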

Cassandra cluster - Store equal data among the nodes

In a Cassandra cluster, how can we ensure that all nodes hold roughly equal amounts of data, rather than one node holding much more and another very little?
If this scenario occurs, what are the best practices?
Thanks
It is ok to expect a slight variation of 5-10%. The most common causes are that the distribution of your partitions may not be truly random (more partitions on some nodes) and that there may be a large variation in the size of the partitions (the smallest partition is a few kilobytes but the largest is 2GB).
There are also 2 other possible scenarios to consider.
SINGLE-TOKEN CLUSTER
If the tokens are not correctly calculated, some nodes may have a larger token range compared to others. Use the token generation tool to get a list of tokens that is correctly distributed around the ring.
If the cluster is deployed with DataStax Enterprise, the easiest way is to rebalance your cluster with OpsCenter.
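For the single-token case, evenly spaced tokens can also be worked out by hand. The Python sketch below is not the token generation tool itself. It assumes Murmur3Partitioner, whose tokens run from -2**63 to 2**63 - 1 (other partitioners use different ranges), and the function name is made up.

```python
def evenly_spaced_tokens(num_nodes: int) -> list:
    """Initial tokens spaced evenly over the Murmur3 range [-2**63, 2**63 - 1]."""
    ring_size = 2**64
    return [-(2**63) + (i * ring_size) // num_nodes for i in range(num_nodes)]

# One initial_token value per node for a hypothetical 4-node ring.
for node, token in enumerate(evenly_spaced_tokens(4)):
    print(f"node {node}: initial_token = {token}")
# node 0: initial_token = -9223372036854775808
# node 1: initial_token = -4611686018427387904
# node 2: initial_token = 0
# node 3: initial_token = 4611686018427387904
```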
VNODES CLUSTER
Confirm that you have allocated the same number of tokens in cassandra.yaml with the num_tokens directive.
Unless you are using ByteOrderedPartitioner for your cluster that should not happen. See DataStax documentation here for more information about available partitioners and why it should not (normally) happen.

Cassandra vnodes performance overhead and changing the number of vnodes

We have a test cluster of 4 nodes, and we've turned on vnodes. It seems that reading out is somewhat slower than the old method (initial_token). Is there some performance overhead by using vnodes? Do we have to increase/decrease the default num_tokens (256) if we only have 4 physical nodes?
Another scenario we would like to test is to change the num_tokens of the cluster on the fly. Is it possible, or do we have to recreate the whole cluster? If possible, how can we accomplish that?
We're using Cassandra 2.0.4.
It really depends on your application, but if you are running Spark queries on top of Cassandra, a high number of vnodes can significantly slow down your queries, by a factor of 2 to 5 or more. This is because Spark cannot subdivide queries across vnodes: each vnode results in one Spark partition, and a high number of partitions slows down low-latency queries.
The recommended number of vnodes is more like 16. In theory this lets you split a two-node cluster into 32 nodes max, which is more than enough of an expansion ratio for most folks.
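To put numbers on this, here is a small Python sketch. It takes the claim above at face value (one Spark partition per vnode) and uses made-up helper names; it only counts token ranges and does not measure anything.

```python
def total_token_ranges(nodes: int, num_tokens: int) -> int:
    """Number of token ranges (vnodes) in the whole cluster."""
    return nodes * num_tokens

def max_expansion_nodes(nodes: int, num_tokens: int) -> int:
    """Theoretical node count if every vnode ended up on its own node."""
    return nodes * num_tokens

# The 4-node test cluster from the question, default vs. suggested num_tokens.
for num_tokens in (256, 16):
    print(num_tokens, total_token_ranges(4, num_tokens))
# 256 1024
# 16 64

print(max_expansion_nodes(2, 16))  # 32, the two-node example above
```

Under that one-partition-per-vnode assumption, a full scan of the 4-node cluster would be split into at least 1024 Spark partitions with the default setting, versus 64 with num_tokens set to 16.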
