Cassandra reads slow with multiple nodes - cassandra

I have a three node Cassandra cluster with version 2.0.5.
RF=3 and all data is synced to all three nodes.
I read from cqlsh with Consistency=ONE.
When I bring down two of the nodes my reads are twice as fast than when I have the entire cluster up.
Tracing from cqlsh shows that the slow down on the reads with a full cluster up occurs when a request is forwarded to other nodes.
All nodes are local to the same datacenter and there is no other activity on the system.
So, why are requests sometimes forwarded to other nodes?
Even for the exact same key if I repeat the same query multiple times I see that sometimes the query executes on the local node and sometimes it gets forwarded and then becomes very slow.

Assuming that the cluster isn't overloaded, Cassandra should always prefer to do local reads when possible. Can you create a bug report at https://issues.apache.org/jira/browse/CASSANDRA ?

This is due to read repair.
By default read repair is applied for all the read with consistency level quorum or with 10% chance for lower consistency levels, that's why for consistency level one sometimes you see more activity and sometime less activity.

Related

cassandra enable hints and repair

I am adding a new node to my cassandra cluster which is currently 5 nodes. The nodes have hints turned on and I am also running repairs using cassandra reaper. When adding the node node, the node addition is taking foreever and the other nodes are becoming unresponsive. I am running cassandra 3.11.13.
questions
As I understand hints are used to make sure writes are correctly propagated to all replicas
Cassandra is designed to remain available if one of it’s nodes is down or unreachable. However, when a node is down or unreachable, it needs to eventually discover the writes it missed. Hints attempt to inform a node of missed writes, but are a best effort, and aren’t guaranteed to inform a node of 100% of the writes it missed.
repairs do something similar
Repair synchronizes the data between nodes by comparing their respective datasets for their common token ranges, and streaming the differences for any out of sync sections between the nodes.
If I am running repairs with cassandra reaper, do I need to disable hints?
If hints are enabled and repairs are carried. Does it cause double writes of data in nodes?
Is it okay to carry repair while a node is joining?

Hazelcast High Availability in case of 3 nodes cluster

We are using Hazelcast IMDG as an in memory grid. The number of nodes in our cluster is three, and we have one sync backup and the cluster is partition aware. In that case, I expect the distributed map will be distributed across 3 nodes (more or less) homogeneously. In case of a node break down, the leadership should be transferred to a healthy node(which has the sync backup for the lost data). If there is a write request to this newly assigned leader node, the same partition should be replicated synchronously to one of the alive nodes. Does it mean that in case of node failure, approximately one third of the distributed map should be replicated and during the replication time, all reads are blocked? Availability is hit if one of three node is down in case of one sync backup till approximately one third of the distribution is restored?
If a node goes down, the cluster will promote the backup partitions to primary.
And the migrations will start to create backups of these new primary partitions.
Please check the Data Partitioning section.
During migrations, read operations are not blocked.
Only write operations are blocked on the partition that is actively migrating.
Since the partitions are migrated one by one, the effect on availability is minimal.

How cassandra improve performance by adding nodes?

I'm going build apache cassandra 3.11.X cluster with 44 nodes. Each application server will have one cluster node so that application do r/w locally.
I have couple of questions running in my mind kindly answer if possible.
1.How many server Ip's should mention in seednode parameter?
2.How HA works when all the mentioned seed node goes down?
3.What is the dis-advantage to mention all the serverIP's in seednode parameter?
4.How cassandra scales with respect to data other than(Primary key and Tunable consistency). As per my assumption replication factor can improve HA chances but not performances.
then how performance will increase by adding more nodes?
5.Is there any sharding mechanism in Cassandra.
Answers are in order:
It's recommended to point to at least to 2 nodes per DC
Seed/contact node is used only for initial bootstrap - when your program reaches any of listed nodes, it "learns" the topology of whole cluster, and then driver listens for nodes status change, and adjust a list of available hosts. So even if seed node(s) goes down after connection is already established, driver will able to reach other nodes
it's harder to maintain usually - you need to keep a configuration parameters for your driver & list of nodes in sync.
When you have RF > 1, Cassandra may read or write data from/to any replica. Consistency level regulates how many nodes should return answer for read or write operation. When you add the new node, the data is redistributed to new node, and if you have correctly selected partition key, then new node start to receive requests in parallel to old nodes
Partition key is responsible for selection of replica(s) that will hold data associated with it - you can see it as a shard. But you need to be careful with selection of partition key - it's easy to create too big partitions, or partitions that will be "hot" (receiving most of operations in cluster - for example, if you're using the date as partition key, and always writing reading data for today).
P.S. I would recommend to read DataStax Architecture guide - it contains a lot of information about Cassandra as well...

what happens when an entire cassandra cluster goes down

I have a cassandra cluster having 3 nodes with replication factor of 2. But what would happen if the entire cassandra cluster goes down at the same time. How read and write can be manage in this situation and what would be the best consistency level so that i can manage my cassandra nodes for high availability, As of now i'm using QUORUM.
If your cluster is down on all nodes - it is down.
When you need HA, think of deploying more than one datacenter, so availability can be maintained even when an entire datacenter/rack goes down.
If you can live with stale data, you could use CL.ONE instead - you need only one node to respond.
More replicas also increases availability for CL.QUORUM - you need RF/2+1 nodes from your replicas alive, in case of 2 -> 2/2+1 = 2 or all your replicas need to be online. With RF=3 you sill only need 2 as 3/2+1 = 2 - now you can have one node down.
As for your writes - all acknowleged writes will be written to disk in the commitlog (if there is no caching issue on your disks) and restored when coming back online. Of course there may be a race condition where the changes are written to disk but not acked via network.
Keep in mind to setup NTP!

Can a Cassandra cluster serve as a replacement for an in-memory Redis key-value store?

My application crawls user's mailbox and saves it to an RDBMS database. I started using Redis as a cache (simple key-value store) for RDBMS database. But gradually I started storing crawler states and other data in Redis that needs to be persistent. Loosing this data means a few hours of downtime. I must ensure airtight consistency for this data. The data should not be lost in node failures or split brain scenarios. Strong consistency is a must. Sharding is done by my application. One Redis process runs on each of ten EC2 m4.large instances. On each of these instances. I am doing up to 20K IOPS to Redis. I am doing more writes than reads, though I have not determined the actual percentage of both. All my data is completely in memory, not backed by disk.
My only problem is each of these instances are SPOF. I cannot use Redis cluster as it does not guarantee consistency. I have evaluated a few more tools like Aerospike, none gives 'No data loss guarantee'.
Cassandra looks promising as I can tune the consistency level I want. I plan to use Cassandra with a replication factor 2, and a write must be written to both the replicas before considered committed. This gives 'No data loss guarantee.
By launching enough cassandra nodes (ssd backed) can I replace my Redis key-value store and still get similar read/write IOPS and
latency? Will opensource cassandra suffice my use case? If not, will the Datastax enterprise In-Memory version solve it?
EDIT 1:
A bit of clarification:
I think I need to use Write consistency level 'ALL' and Read consistency level 'One'. I understand that with this consistency level my cluster will not tolerate any failure. That is OK for me. A few minutes of downtime occasionally is not a problem, as long as my data is consistent. In my present setup, one Redis instance failure causes a few hours of downtime.
I must ensure airtight consistency for this data.
Cassandra deals with failure better when there are more nodes. Assuming your case allows for having more nodes, this is my suggestion.
So, if you have 5 nodes, use CL of QUORUM for both READ and WRITE. What it means is that you always write to at least 3 nodes and read from 3 nodes.(for 5 nodes , QUORUM is 3).
This ensures a very high level consistency
Also ensures limited downtime. Even if a node is down your writes and reads won't break.
If you use CL ALL, then even if one node is down or overloaded, you will have to take a full downtime.
I hope it helps!

Resources