Cassandra QUORUM write consistency level and multiple DC

I'm a bit confused about how QUORUM write selects nodes to write into in case of multiple DC.
Suppose, for example, that I have a 3-DC cluster with 3 nodes in each DC, and the replication factor is 2, so that the number of replicas needed to achieve QUORUM is 3. Note: this is just an example to help me formulate my question, not the actual configuration.
My question is the following: in the case of a write, how will these 3 replicas be distributed across the DCs in my cluster? Is it possible that all 3 replicas end up in the same DC?

Replication is defined at the keyspace level. For example:
create keyspace test with replication = { 'class' : 'NetworkTopologyStrategy', 'DC1' : 2, 'DC2' : 2, 'DC3' : 2 };
As you can see, each DC will hold exactly two copies of the data for that keyspace, and no more. You could have another keyspace in the same cluster defined to replicate in only one DC and not the other two. So it's flexible.
Now for consistency: with 3 DCs and RF=2 in each DC, you have 6 copies of the data. By definition of QUORUM, a majority (total replicas / 2 + 1) of those 6 replicas needs to acknowledge the write before it is claimed successful. So 4 replicas need to respond for a QUORUM write here, and these 4 could be a combination of nodes from any DC. Remember that the number of replicas matters for calculating quorum, not the total number of nodes in a DC.
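The arithmetic above can be sketched in a few lines of plain Python (the function name is just illustrative):

```python
def quorum(replicas: int) -> int:
    """QUORUM = (sum of replication factors across all DCs) // 2 + 1."""
    return replicas // 2 + 1

# 3 DCs with RF=2 each -> 6 replicas in total
total_replicas = 3 * 2
print(quorum(total_replicas))  # 4 replicas must acknowledge a QUORUM write
```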
On a side note, in Cassandra RF=2 is as good as RF=1. To simplify, let's imagine a 3-node, single-DC cluster. With RF=2 there are two copies of the data, and to achieve quorum ((2/2) + 1 = 2), both replica nodes need to acknowledge the write. So both nodes always have to be available: if one node fails, quorum writes start to fail. Another node can take hints here, but your quorum reads are bound to fail. So the fault tolerance for node failure in this situation is zero.
You could use LOCAL_QUORUM instead of QUORUM to speed up writes. It sacrifices consistency for speed. Welcome to "eventual consistency".

Consistency level: determines the number of replicas on which the write must succeed before returning an acknowledgment to the client application.
Even at low consistency levels, the write is still sent to all replicas for the written key, even replicas in other data centers. The consistency level just determines how many replicas are required to respond that they received the write.
Source : http://docs.datastax.com/en/archived/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
So if you set the consistency level to QUORUM, and each DC has an RF of 2, then QUORUM is 6/2 + 1 = 4. Your write is still sent to all replicas in each DC (3 × 2 = 6 nodes), but the coordinator waits for 4 of them to succeed before sending the acknowledgment to the client.

Related

ResponseError: Not enough replicas available for query at consistency SERIAL (2 required but only 1 alive)

I am a newcomer to Cassandra. I've currently hit an issue; my Cassandra setup is as follows:
1 DC, 1 Cluster
3 Nodes.
SimpleStrategy
durable write : true
Replication factor : 2 when creating keyspace.
Use IF NOT EXISTS to insert data into table.
Seed node: 2 of them
Then I brought down one seed node and got the following error:
ResponseError: Not enough replicas available for query at consistency SERIAL (2 required but only 1 alive)
That's normal: SERIAL requires a Paxos transaction with a quorum of replicas. For RF=2, the quorum is 2; in other words, you cannot tolerate any node being down when writing at SERIAL to a keyspace with RF=2.
Rule of thumb: don't use RF=2, it's useless. Your quorum is (2/2) + 1 = 2, but for RF=3 the quorum is the same. So you should always prefer RF=3: if you change your keyspace to RF=3, your application will be able to write at SERIAL even with one replica down.
Also see https://www.ecyrd.com/cassandracalculator/
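The rule of thumb follows directly from the arithmetic; here is a small illustrative Python sketch:

```python
def quorum(rf: int) -> int:
    return rf // 2 + 1

def tolerated_down(rf: int) -> int:
    # replicas that can be down while a quorum is still reachable
    return rf - quorum(rf)

for rf in (1, 2, 3, 5):
    print(f"RF={rf}: quorum={quorum(rf)}, tolerates {tolerated_down(rf)} down")
```

Note that RF=2 and RF=3 have the same quorum (2), but RF=3 tolerates one replica down while RF=2 tolerates none.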
As per my understanding, SERIAL consistency is equivalent to QUORUM. You have RF=2 in a 3-node cluster, and data placement in Cassandra is based on the token hash, so the two replicas of your row may well live on the two seed nodes. When you then query with one seed node down, you can get this error because the cluster cannot achieve the desired consistency level.
Please refer to this link for more details:
https://docs.datastax.com/en/ddac/doc/datastax_enterprise/dbInternals/dbIntConfigSerialConsistency.html

Insert rows only in one datacenter in cassandra cluster

For test purposes I want to break the consistency of data in my test Cassandra cluster, which consists of two datacenters.
I assumed that using a consistency level of LOCAL_QUORUM or LOCAL_ONE would achieve this. Let's say I have a Cassandra node node11 belonging to DC1:
cqlsh node11
CONSISTENCY LOCAL_QUORUM;
INSERT INTO test.test (...) VALUES (...) ;
But in fact, the data appears on all nodes: I can read it from node22, belonging to DC2, even with a LOCAL_* consistency level. I've double-checked: nodetool shows the two datacenters, and node11 certainly belongs to DC1 while node22 belongs to DC2.
My keyspace test is configured as follows:
CREATE KEYSPACE "test"
WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'dc1' : 2, 'dc2' : 2};
and I have two nodes in each DC respectively.
My questions:
It seems I have misunderstood the idea of these consistency levels. In fact they do not prevent data from being written to the other DCs; they only require the data to appear in at least the current datacenter. Is that a correct understanding?
More importantly: is there any way to perform such a trick and achieve such "broken" consistency, with different data stored in the two datacenters of a single cluster?
(At the moment I think the only way to achieve that is to break the ring and not let nodes from one DC know anything about nodes from the other, but I don't like this solution.)
LOCAL_QUORUM only requires a quorum of acknowledgements from the local DC; the data is still sent to all replicas defined in the keyspace:
Even at low consistency levels, the write is still sent to all
replicas for the written key, even replicas in other data centers. The
consistency level just determines how many replicas are required to
respond that they received the write.
https://docs.datastax.com/en/archived/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
I don't think there is a proper way to do that.
The following suggestion is for test scenarios only, to break data consistency between the 2 DCs (I haven't tried it, but based on my understanding it should work):
Bring the DC2 nodes down; while DC2 is down, the DC1 coordinator will store hints for them.
Write data in one DC (say DC1) with a LOCAL_* consistency level.
Let max_hint_window_in_ms (3 hours by default; you can reduce it) pass, so that the DC1 coordinator deletes all the hints.
Start the DC2 nodes and query with a LOCAL_* consistency level: the data written in DC1 won't be present in DC2.
You can repeat these steps and insert data in DC2 with different values while keeping DC1 down, so the same rows will have different values in DC1 and DC2.

Difference between consistency level and replication factor in cassandra?

I am new to cassandra and I wanted to understand the granular difference between consistency level and replication factor.
Scenario: if I have a replication factor of 2 and a consistency level of 3, how would the write operation be performed? When the consistency level is set to 3, the write is acknowledged to the client after being written to 3 nodes. But if data is written to 3 nodes, doesn't that give me a replication factor of 3 rather than 2? Are we sacrificing the replication factor in this case?
Can someone please explain where my understanding is wrong?
Thanks!
Replication factor: How many nodes should hold the data for this keyspace.
Consistency level: how many nodes need to respond to the coordinator node for the request to be successful.
So you can't have a consistency level higher than the replication factor, simply because you can't expect more nodes to answer a request than the number of nodes holding the data.
Here are some references:
Understand cassandra replication factor versus consistency level
http://docs.datastax.com/en/cassandra/2.1/cassandra/architecture/architectureDataDistributeReplication_c.html
http://docs.datastax.com/en/archived/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
You will get an error: Cannot achieve consistency level THREE.
You can do some further reading here
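The check the server performs amounts to comparing the requested CL against the RF; a hypothetical sketch in Python (the names are mine, not the driver's):

```python
# numeric acknowledgements implied by each named consistency level
CL = {"ONE": 1, "TWO": 2, "THREE": 3}

def validate(cl: str, rf: int) -> None:
    # a request can never gather more acknowledgements than there are replicas
    if CL[cl] > rf:
        raise ValueError(f"Cannot achieve consistency level {cl}")

validate("TWO", 2)    # fine: 2 replicas, 2 acknowledgements requested
# validate("THREE", 2) would raise: only 2 replicas hold the data
```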
Consistency levels are of two types: write consistency and read consistency. A consistency level can be ONE, TWO, THREE, or QUORUM. QUORUM requires a majority (more than half) of the replicas to be available for the operation; for ONE, TWO, and THREE the name itself gives you the definition.
Replication factor is the number of copies of the data that you plan to maintain in the cluster. With SimpleStrategy you have a single replication factor; with NetworkTopologyStrategy in a multi-DC cluster, you set a replication factor for each datacenter.
In your scenario, with RF=2 and CL=THREE, the write will not work: the coordinator needs three replica acknowledgements, but only two nodes ever hold the data, so the request is rejected as unavailable.
For your second question
When consistency level is set to 3, it means the results will be
acknowledged to the client after writing to the 3 nodes. If data is written to 3 nodes, then it gives me a replication factor of 3 and not 2..?
As far as I understand Cassandra, the consistency level does not change how many copies are stored: the number of nodes that eventually receive the data is always equal to the RF, and the CL only determines how many of them must acknowledge before the client gets a response.
So there is no question of sacrificing the RF.

Cassandra Read/Write CONSISTENCY Level in NetworkTopologyStrategy

I have set up Cassandra in 2 data centers, with 4 nodes each and a replication factor of 2.
Consistency level is ONE (set by default)
I was facing consistency issues when trying to read data at a consistency level of ONE.
As I read in the DataStax documentation, the sum of the read and write consistency levels should be greater than the replication factor.
I decided to change the write consistency level to TWO and keep the read consistency level at ONE, which resolves the inconsistency problem in a single data center.
But in the case of multiple data centers, would the problem be resolved by a consistency level of LOCAL_QUORUM?
How would I achieve a write of (LOCAL_QUORUM + TWO), so that the write goes to the local data center and also to 2 nodes?
Just write using LOCAL_QUORUM in the datacenter you want. If you have a replication factor of 2 in each of your datacenters, then the data you write in the "local" datacenter will eventually be replicated to the "other" datacenter (but you have no guarantee of when).
LOCAL_QUORUM means: "after the write operation returns, the data has effectively been written on a quorum of nodes in the local datacenter".
TWO means: "after the write operation returns, the data has been written on at least 2 nodes in any of the datacenters".
If you want to read the data you have just written with LOCAL_QUORUM from the same datacenter, you should use LOCAL_ONE consistency. If you read with ONE, there is a chance that the closest replica is in the "remote" datacenter and has therefore not yet received the write.
This also depends on the load balancing strategy configured at the driver level. You can read more about this here: https://datastax.github.io/java-driver/manual/load_balancing/
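The "read CL + write CL > RF" rule the question mentions can be checked mechanically; a small Python sketch (per datacenter, assuming LOCAL_* levels, with names of my choosing):

```python
def quorum(rf: int) -> int:
    return rf // 2 + 1

def strongly_consistent(write_acks: int, read_acks: int, rf: int) -> bool:
    # the read and write replica sets must overlap in at least one node
    return write_acks + read_acks > rf

rf = 2
print(strongly_consistent(2, 1, rf))                    # TWO + ONE  -> True
print(strongly_consistent(1, 1, rf))                    # ONE + ONE  -> False
print(strongly_consistent(quorum(rf), quorum(rf), rf))  # quorum both ways -> True
```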

Cassandra Quorum : Consistency Level

I have a 3-DC ring in Cassandra, with each DC being a 4-node cluster, so that's 4 nodes × 3 DCs = 12 nodes. I'm testing how Cassandra behaves when some nodes go down under QUORUM consistency. We have set a replication factor of 3 in each datacenter, so our
quorum = floor(sum of replication factors / 2) + 1 = floor(9/2) + 1 = 5.
In theory, if I have five nodes up in my 12-node cluster, I should be good for reads and writes. So I brought down a full datacenter (DC1) and 3 nodes in another datacenter (DC2), leaving 1 node up in DC2 and the whole of DC3 (4 nodes): 5 nodes up in total. In theory, this should be enough for my writes to succeed at QUORUM consistency. But when I ran it, I got:
Cassandra.Unavailable Exception: Not enough replica available for query at consistency ONE (5 required but only 4 alive).
But, I do have 5 nodes alive. What am I missing here ?
QUORUM is for the entire cluster; LOCAL_QUORUM is for a single datacenter. Some basics: Cassandra is a distributed system, meaning data is distributed across your cluster, with each node owning a primary token range while also replicating data from other nodes. Only the nodes responsible for storing a given piece of data are counted for consistency. In your case, 5 nodes being up does not mean QUORUM is met for your writes or reads: the DC with all nodes up will definitely hold the data on 3 nodes (remember, your RF is 3), but the single live node in the other DC may or may not hold a replica of the data you are querying.
In your case, if you hit the DC with all nodes up using LOCAL_QUORUM, you will get correct results.
Note that QUORUM is computed over all replicas of a key in the cluster, not per datacenter. DC3 can contribute at most 3 of those replicas (its RF), and the single live node in DC2 may contribute one more, which is still short of the 5 you asked for. That is why concepts like ONE, LOCAL_ONE, and LOCAL_QUORUM exist. With all nodes in every DC up, a QUORUM of 5 out of the 9 replicas would succeed.
You can refer : http://docs.datastax.com/en/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
From my point of view, the operation should and will fail.
From the DC that is up you can guarantee 3 replicas at any time (RF=3).
The live node in the other DC has a 75% chance (3 replicas spread over 4 nodes) of holding another replica.
3 + 1 = 4 at best.
You're asking for CL 5.
5 > 4 => fail.
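The replica counting in this failure scenario can be sketched in Python (a best-case bound, assuming the live DC2 node does hold a replica of the key):

```python
def reachable_replicas(rf_per_dc, up_nodes_per_dc):
    # each DC contributes at most its RF replicas, and no more than its live nodes
    return sum(min(rf, up) for rf, up in zip(rf_per_dc, up_nodes_per_dc))

rf_per_dc = [3, 3, 3]   # RF=3 in each of the three DCs -> 9 replicas total
up = [0, 1, 4]          # DC1 down, one node up in DC2, all of DC3 up

required = sum(rf_per_dc) // 2 + 1             # QUORUM over 9 replicas = 5
best_case = reachable_replicas(rf_per_dc, up)  # at most 0 + 1 + 3 = 4
print(required, best_case)  # 5 required but at most 4 reachable -> unavailable
```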
