Cassandra result from running query cqlsh from one of the node cluster - cassandra

I am new to cassandra and haven't finished read the doc yet.
Just want to know if I run cqlsh from one node of a 3 nodes cluster and run query
e.g.
cqlsh node1host -u username -p passwd -k my_cass_keyspace
> select ...
is the result come from all the 3 nodes or it is just result from the node that I run cqsh in?
Sorry for very noob question.
Thanks.

General answer : from all node.
Detailed answer :
Your node might be the coordinator, then depending on your replication factor, the node might fetch data from other nodes ( for example RF 1, then you query on a partition from another node).
This depends on your replication factor, and also your consistency level.
you can check your consistency when using cqlsh by (default is ONE) :
cqlsh> consistency;
Current consistency level is ONE.
You can change it by ( to QUORUM for example) :
cqlsh> CONSISTENCY QUORUM ;
If you want to know details about your request execution plan, try to activate tracing :
cqlsh> tracing on ;
I hope this helps !

The answer is Yes, it comes from all the nodes but it depends upon you cluster configurations, replication factor and consistency level.
For Example:- You have 3 nodes cluster and replication factor is 3 and consistency level is quorum for read and write both so whenever you will do insert query then your data will replicate to all 3 nodes but 2 nodes acknowledgement is sufficient to coordinator as quorum. same fill follow if you do a select query.
You may also refer Cassandra documentation as below:-
http://cassandra.apache.org/doc/latest/architecture/dynamo.html#replication

Related

ResponseError: Not enough replicas available for query at consistency SERIAL (2 required but only 1 alive)

I am a newcomer for Cassandra, current I met an issue, my cassandra setup as following,
1 DC, 1 Cluster
3 Nodes.
SimpleStrategy
durable write : true
Replication factor : 2 when creating keyspace.
Use IF NOT EXISTS to insert data into table.
Seed node: 2 of them
Then I bring down one seed node, and I got the following error:
ResponseError: Not enough replicas available for query at consistency SERIAL (2 required but only 1 alive)
That's normal, SERIAL requires a Paxos transaction with a quorum of replicas. For RF 2, the quorum is 2; iow, you cannot tolerate any node down to write at SERIAL to a keyspace with RF 2.
Rule of thumb: don't use RF 2, it's useless. Your quorum is: (2/2)+1 = 2, but for RF 3, it's the same quorum. So you should always prefer RF 3. If you change your keyspace to RF 3, your application would be able to write at SERIAL even if one replica is down.
Also see https://www.ecyrd.com/cassandracalculator/
As per understanding Consistency serial is equivalent to QUORUM.You have RF=2 in 3 node cluster so data in Cassandra inserted based on hash. so when you have inserted the data into the cluster, data may be inserted on both seed nodes.So when you are retrieving the data with one seed node down you can get this error as cluster is not achieving the desired consistency level.
Please refer link for more details.
https://docs.datastax.com/en/ddac/doc/datastax_enterprise/dbInternals/dbIntConfigSerialConsistency.html

Cassandra clustering fail over-High Avialability

I have configured a cassandra clustter with 3 nodes
Node1(192.168.0.2) , Node2(192.168.0.3), Node3(192.168.0.4)
Created a keyspace 'test' with replication factor as 2.
Create KEYSPACE test WITH replication = {'class':'SimpleStrategy',
'replication_factor' : 2}
When I stop either Node2 or Node3 (one at a time and both at one time) , I am able to do the CRUD operations on the keyspace.table.
When I stop Node1 and try to update/create a row from Node4 or Node3, getting following error although Node3 and Node4 are up and running-:
All host(s) tried for query failed (tried: /192.168.0.4:9042
(com.datastax.driver.core.exceptions.DriverException: Timeout while
trying to acquire available connection (you may want to increase the
driver number of per-host connections)))
com.datastax.driver.core.exceptions.NoHostAvailableException: All
host(s) tried for query failed (tried: /192.168.0.4:9042
(com.datastax.driver.core.exceptions.DriverException: Timeout while
trying to acquire available connection (you may want to increase the
driver number of per-host connections)))
I am not sure how Cassandra elects a leader if a leader node dies.
So, you are using replication_factor 2, so only 2 nodes will have a replica of you keyspace (not all the 3 nodes).
My first advise is to change the RF to 3.
You have to pay attention to the consistency level you are using; If you have only 2 copies of you data (RF: 2), and you are using Consistency Level QUORUM, it will try to write the data on half of nodes + 1, in this case, all 2 nodes. So if 1 node is down, you will not be able to write/read data.
to verify where the data is replicated you could see how is the ring in you cluster. As you are using SimpleStrategy it will copy the data clockwise direction. And in your case its copied at nodes at 192.168.0.2 and 192.168.0.3.
Take a look at the concepts of replication factor: http://docs.datastax.com/en/archived/cassandra/2.0/cassandra/architecture/architectureDataDistributeReplication_c.html
And Consistency Level: http://docs.datastax.com/en/archived/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
Great answer about RF vs CL: https://stackoverflow.com/a/24590299/6826860
You can use this calculator to find out if your setup have a decent consistency. In your case the result is You can survive the loss of no nodes without impacting the application
I think I wasn't clear at response. The replication factor is about how many copies of your data will exists. The consistency level is how many copies your client will wait to be made before get an response from server.
Ex: All your nodes are up. The client make a CQL with CL Quorum, the server will copy the data in 2 nodes (3/2 + 1) and reply to client, in background it will copy the data at the third node as well.
In your example, if you shutdown 2 nodes of a 3 node cluster you will never achieve an QUORUM to make requests (with CL QUORUM), so you have to use consistency level ONE, once the nodes are up again, cassandra will copy the data on them. One thing that can happen is: before cassandra copy the data on other 2 nodes, the client make a request for node1 or node2 and the data is not there yet.

Consistency level in cassandra issue

Environment :
5 machines Cassandra 2.1.15 cluster.
RF = 3, CL = QUORUM
1 machine goes down for more than 3 hours, without the possibility to bring it back
Decide to do noderemove and replace it :
The problem i saw is this :
Did heavy load over the node :
cassandra-stress write n=50000000 cl=QUORUM -rate threads=1000 -node 192.168.0.171,192.168.0.177,192.168.0.178,192.168.0.179,192.168.0.220
At one time gave me the error :
com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency QUORUM (3 replica were required but only 2 acknowledged the write)
According to my knowledge QUORUM = RF/2+1 rounded down => 2 replicas should be acquired.
Is this some kind of a bug!? Does it have some kind of negative impact?
Are you certain that cassandra-stress is using your keyspace? If you have not configured it to do so, it must be using default keyspace with replications as many as number of nodes. Try using -schema switch for cassandra-stress.
https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsCStress_t.html

Cassandra cluster with each node total replication

Hi I'm new to Cassandra. I have a 2 node Cassandra cluster. For reasons imposed by the front end I need...
Total replication of all data on each of the two nodes.
Eventual consistent writes. So the node being written to will respond with an acknowledge to the front end straight away. Not synchronized on the replication
Can anyone tell me is this possible? Is it done in the YAML file? I know there is properties there for consistency but I don't see that any of the Partitioners suit my needs. Where can I set the replication factor?
Thanks
You set the replication factor during creation of the keyspace. So if you use (and plan for the future on using) a single data center set-up, you create the keyspace using cqlsh like so
CREATE KEYSPACE "Excalibur"
WITH REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor' : 3};
Check out the documentation regarding the create keyspace. How this is handled internally is related to the snitch definition of the cluster and a strategy option defined per keyspace. In the case of the SimpleStrategy above, this simply assumes a ring topology of your cluster and places the data clockwise in that ring (see this).
Regarding consistency, you can very different levels of consistency for write and read operations in your client/driver during each operation:
Cassandra extends the concept of eventual consistency by offering tunable consistency―for any given read or write operation, the client application decides how consistent the requested data should be.
Read the doc
If you use Java in your clients, and the DatatStax Java driver, you can set the consistency level using
QueryOptions.setConsistencyLevel(ConsistencyLevel consistencyLevel)
"One" is the default setting.
Hope that helps

Cassandra nodes ownership is 0.00%

I have a Cassandra cluster with 2 nodes. I am using NetworkTopologyStrategy
I was trying to increase the replication factor of keyspace in Cassandra to 2. I did the following steps:
UPDATE KEYSPACE demo WITH strategy_options = {DC1:2,DC2:2}; on both the nodes
Then I ran the nodetool repair on both the nodes
Then I ran my Hector code to count the number of rows and columns in the database.
I get the following error: Unavailable Exception
Also when I run the command
./nodetool –h ip_address ring
I found that both nodes ownership is 0 %. Please tell me how should I fix that.
You mention "both nodes", which implies that you have two total nodes rather than two data centers as would be suggested by your strategy options. Specifying {DC1:2,DC2:2} would require a minimum of four nodes (two in each DC to satisfy the replication factor), although this would not be advised since essentially all your nodes would be points of failure.
A minimal Cassandra cluster should have at least three nodes, in which case a RF of two would allow one node to go down without bringing down the system. It sounds like you have a single cluster (rather than two data centers), so what you really need is one more node (3 total), RF=2, using the SimpleStrategy instead of NetworkTopologyStrategy.

Resources