I have one node with replication factor 1 and run a batch statement against that node. Cassandra writes the data but fails to acknowledge it within the timeout limit, so the client gets a write timeout exception with the following stack trace.
Exception in thread "main" com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency ONE (1 replica were required but only 0 acknowledged the write)
at com.datastax.driver.core.exceptions.WriteTimeoutException.copy(WriteTimeoutException.java:54)
at com.datastax.driver.core.ResultSetFuture.extractCauseFromExecutionException(ResultSetFuture.java:271)
at com.datastax.driver.core.ResultSetFuture.getUninterruptibly(ResultSetFuture.java:187)
at com.datastax.driver.core.Session.execute(Session.java:126)
at jason.Stats.analyseLogMessages(Stats.java:91)
at jason.Stats.main(Stats.java:48)
If you then go back and check the table, you will find that the data has been written. So my question is: if Cassandra throws a write timeout exception, shouldn't it roll back the changes?
I mean, I don't want the data written to the database if I get a write timeout exception. Is there any rollback strategy for this scenario?
Based on your description, you are expecting Cassandra to support ACID-compliant transactions, at least with regard to the A (atomicity). Cassandra does not provide ACID-compliant transactions; instead it relies on eventual consistency to provide a durable data store. Cassandra does provide atomicity inasmuch as a write to a single partition on a node is atomic, by which I mean an entire row will either be written or not. However, a write can still succeed on one or more replicas after the timeout set by your client has expired. In that case the client receives an error but the data is written. Nothing will roll back that write. Instead, the data in the cluster will become consistent through the normal repair mechanisms.
My suggestion for you would be to:
In the case of a timeout, retry the write query.
Investigate why you are getting a timeout error on a write with CL=ONE. If this is a multi-DC setup, have you tried CL=LOCAL_ONE?
Some docs to read:
https://docs.datastax.com/en/cassandra/2.1/cassandra/dml/dml_atomicity_c.html
https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/opsRepairNodesReadRepair.html
Cassandra does not have any notion of rollbacks. If a write times out, that means the write may or may not have succeeded. This is why C* tries to steer users toward idempotent data models and structures.
The only means of actually performing some kind of conditional write is via lightweight transactions, which allow for some check-and-set operations.
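To make the "retry an idempotent write" advice concrete, here is a minimal sketch in plain Python. It does not use any Cassandra driver: `FlakyReplica` and `write_with_retry` are hypothetical names invented for this illustration, and the replica simply applies every write while dropping the first few acknowledgements, which is the situation described in the question.

```python
class FlakyReplica:
    """Toy single-replica store (hypothetical): applies every write,
    but loses the acknowledgement for the first `lost_acks` attempts."""

    def __init__(self, lost_acks):
        self.rows = {}
        self.lost_acks = lost_acks

    def upsert(self, key, value):
        # The write is applied first, so the data is stored...
        self.rows[key] = value
        # ...but the acknowledgement may not reach the client in time.
        if self.lost_acks > 0:
            self.lost_acks -= 1
            raise TimeoutError("write timed out; outcome unknown")


def write_with_retry(store, key, value, max_attempts=3):
    """Retry an idempotent upsert until it is acknowledged.
    Returns the number of attempts that were needed."""
    for attempt in range(1, max_attempts + 1):
        try:
            store.upsert(key, value)
            return attempt
        except TimeoutError:
            if attempt == max_attempts:
                raise  # give up and surface the timeout to the caller


store = FlakyReplica(lost_acks=2)
attempts = write_with_retry(store, "user:42", {"visits": 10})
```

Because the upsert sets the row to the same value each time, applying it three times leaves the same state as applying it once. A non-idempotent operation such as a counter increment would not be safe to retry this way.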
I didn’t get from the documentation what happens after a node fails during writes.
I got the idea of quorum but what happens after “the write transaction” fails?
For example:
I inserted the record and chose the level of consistency equal to QUORUM.
Assume QUORUM = 3 nodes and 2 of 3, or just 1 of 3, nodes wrote the data but the rest failed.
I got an error.
What happens with the record on the nodes which wrote it?
How can Cassandra prevent this row from propagating to other nodes through replica synchronization?
Or, if I get errors on writing, does it actually mean that this row could appear on each replica after some time?
Cassandra doesn't have transactions (except lightweight transactions, which are a different kind of thing). When some nodes have received and written the data and others have not, there is no rollback or anything like it. The data is written. But the coordinator node sees that the consistency level couldn't be reached and reports an error back to the client application, so the write can be retried if necessary. If it's not retried, the data can still be propagated through repair operations, either read repair or explicit repair. But if the data is on a single node only, that node may fail before repair happens, and the data could be lost.
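The propagation described above can be sketched with a toy model. This is plain Python, not driver code: each replica is a dict mapping a key to a `(timestamp, value)` pair, and `read_with_repair` is a hypothetical helper. As a simplifying assumption, the read contacts every replica; a real read repair only touches the replicas involved in that read.

```python
def read_with_repair(replicas, key):
    """Return the newest (timestamp, value) for `key` and copy it to
    any replica that holds an older version or none at all."""
    versions = [r[key] for r in replicas if key in r]
    if not versions:
        return None
    newest = max(versions, key=lambda v: v[0])  # last write wins by timestamp
    for r in replicas:
        if r.get(key) != newest:
            r[key] = newest  # repair the stale replica
    return newest


# Three replicas that all hold the old version of the row.
replicas = [{"k": (1, "old")}, {"k": (1, "old")}, {"k": (1, "old")}]

# A write at timestamp 2 reached only replica 0; the client saw an error.
replicas[0]["k"] = (2, "new")

# A later read finds the newer version and spreads it to the others.
result = read_with_repair(replicas, "k")
```

So the "failed" write ends up on every replica, which is the opposite of a rollback.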
I'm not able to understand the scenario where, during the write process, the desired write consistency level cannot be met. For example, suppose I have 3 nodes, 2 in one data center (dc1) and the remaining one in the other data center (dc2), with NetworkTopologyStrategy. Now if I'm writing with consistency level THREE and one of the nodes is down, what exactly will happen?
Since 2 nodes are up, they will be able to complete the write process. However, since the consistency level cannot be met, the coordinator node will return a write error to the client.
What will happen to the data written in the 2 nodes? The client will not be expecting any data in any node because he received a write error.
There is no rollback in Cassandra, then how does Cassandra remove failed writes?
According to the above link, Cassandra does not rollback writes.
Does Cassandra write to a node(which is up) even if Consistency cannot be met?
The accepted answer in the above link states that "On the nodes that the write succeeded, the data is actually written and it is going to be eventually rolled back."
If the coordinator cannot write to enough replicas to meet the
requested consistency level, it throws an Unavailable Exception and
does not perform any writes.
If the coordinator doesn't know about the replica failure beforehand, i.e. the replica failed during the write, then the coordinator will throw a timeout exception and the client will have to handle it (retry policies).
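The difference between the two failure modes can be sketched as follows. This is a simplified model in plain Python, not Cassandra's implementation: `coordinate_write` and the three replica states are hypothetical names made up for the illustration.

```python
def coordinate_write(replica_states, required):
    """Model the coordinator's decision for one write.

    replica_states maps a replica name to one of:
      'up'             - replica stores the write and acknowledges it
      'known_down'     - coordinator already knows the replica is dead
      'dies_mid_write' - replica looked alive but never acknowledges
    Returns (outcome, set of replicas that stored the data).
    """
    believed_alive = [n for n, s in replica_states.items() if s != "known_down"]
    if len(believed_alive) < required:
        # Rejected up front: nothing is written anywhere.
        return "UnavailableException", set()
    stored = {n for n in believed_alive if replica_states[n] == "up"}
    if len(stored) < required:
        # Too few acks arrived in time, but some replicas kept the data.
        return "WriteTimeoutException", stored
    return "success", stored


unavailable = coordinate_write(
    {"a": "up", "b": "known_down", "c": "known_down"}, required=2)
timed_out = coordinate_write(
    {"a": "up", "b": "dies_mid_write", "c": "known_down"}, required=2)
ok = coordinate_write(
    {"a": "up", "b": "up", "c": "known_down"}, required=2)
```

Only the timeout case leaves data behind on a replica while the client receives an error.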
Cassandra Write Request
The questions are regarding the “CAS operations” paragraph in the article: http://www.datastax.com/dev/blog/cassandra-error-handling-done-right
a)
If the paxos phase fails, the driver will throw a WriteTimeoutException with a WriteType.CAS as retrieved with WriteTimeoutException#getWriteType(). In this situation you can’t know if the CAS operation has been applied.
How do you understand this?
I thought that If the paxos (prepare) phase fails then the coordinator will not initiate the commit phase at all?
I guess that it does not matter how the paxos phase fails (not enough replicas or replica timeouts or ..).
b)
The commit phase is then similar to regular Cassandra writes… you can simply ignore this error if you make sure to use setConsistencyLevel(ConsistencyLevel.SERIAL) on the subsequent read statements on the column that was touched by this transaction, as it will force Cassandra to commit any remaining uncommitted Paxos state before proceeding with the read
Wondering about the above with relation to writes with ConsistencyLevel.QUORUM:
If the commit phase failed because there is no quorum (unavailable nodes or timeouts) then we get back WriteTimeoutException with a WriteType of SIMPLE, right?
In this case it is not clear if the write is actually successful or not, right?
So I’m not sure what are all the possibilities from now on (recover/rollback/nothing)?
Is it saying that if I use ConsistencyLevel.QUORUM for the read operation I can see the old data version (as if the above write was not successful) for some time and after that again with QUORUM read I will see that the write is successful?
(Actually, I have seen exactly this in a 3-node cluster with replication factor 3 after a WriteTimeoutException (2 replica were required but only 1 acknowledged the write): a quorum read just after that returned the old data, and then when I checked with cqlsh I saw the new data.)
How is this possible?
guess:
Probably after the timeout the coordinator says that we have no quorum for the commit phase yet (and subsequent QUORUM reads get the older data version) and returns a WriteTimeoutException with WriteType SIMPLE to the client. When the nodes that timed out actually respond and commit, we have a quorum at that later moment, and after it all quorum reads will obtain the newer data version.
But I am not sure about the explanation of what happens when you read with SERIAL.
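The observation above (an old value from one QUORUM read, the new value later) can be reproduced with a small enumeration. This is a toy model in plain Python under stated assumptions: three replicas hold `(timestamp, value)` pairs, a quorum read contacts any two of them and returns the value with the highest timestamp, and read repair is ignored. `quorum_read_results` is a hypothetical helper.

```python
from itertools import combinations


def quorum_read_results(replicas):
    """For every pair of replicas a quorum read could contact,
    return the value with the newest timestamp among that pair."""
    return {
        pair: max(replicas[i] for i in pair)[1]
        for pair in combinations(range(len(replicas)), 2)
    }


# After the timed-out write: only replica 0 has the new version.
replicas = [(2, "new"), (1, "old"), (1, "old")]
before = quorum_read_results(replicas)

# Later the slow replica 1 finally applies the same write.
replicas[1] = (2, "new")
after = quorum_read_results(replicas)
```

In `before`, the pair `(1, 2)` still returns the old value while the other two pairs return the new one, so the result depends on which replicas the read happens to contact. In `after`, every pair returns the new value.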
The below statement from Cassandra documentation is the reason for my doubt.
For example, if using a write consistency level of QUORUM with a replication factor of 3, Cassandra will replicate the write to all nodes in the cluster and wait for acknowledgement from two nodes. If the write fails on one of the nodes but succeeds on the other, Cassandra reports a failure to replicate the write on that node. However, the replicated write that succeeds on the other node is not automatically rolled back.
Ref : http://docs.datastax.com/en/cassandra/2.1/cassandra/dml/dml_atomicity_c.html
So does Cassandra write to a node(which is up) even if Consistency cannot be met ?
I got it. Cassandra will not even attempt the write if it knows that the consistency level cannot be met. If the consistency level CAN be met, but there are not enough live replicas to satisfy the replication factor, then Cassandra writes to the currently available replicas and returns a success message. Later, when the missing replica is up again, it writes to that replica too.
For example, if the replication factor is 3 and 1 of 3 nodes is down, a write with a consistency of 2 will succeed. But if the replication factor is 2 and 1 of 2 nodes is down, then for a write with a consistency of 2, Cassandra will not even write to the single node that is available.
What the documentation describes is the case where the write was initiated while the consistency level could be met, but in between one node went down and couldn't complete the write, whereas the write succeeded on the other node. Since the consistency level cannot be met, the client gets a failure message. The record which was written to a single node would be removed later during node repair or compaction.
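The two numeric examples in this answer come down to simple arithmetic, sketched below in plain Python. The helper names `quorum` and `can_attempt_write` are hypothetical; the quorum formula (replication factor divided by two, rounded down, plus one) is the usual definition.

```python
def quorum(replication_factor):
    """Number of replicas that make up a quorum."""
    return replication_factor // 2 + 1


def can_attempt_write(replication_factor, nodes_down, required_acks):
    """True if enough replicas are known to be alive for the
    coordinator to attempt the write at all."""
    return replication_factor - nodes_down >= required_acks


# RF=3, one node down, consistency of 2: the write is attempted.
first_example = can_attempt_write(3, nodes_down=1, required_acks=2)

# RF=2, one node down, consistency of 2: rejected before any write.
second_example = can_attempt_write(2, nodes_down=1, required_acks=2)
```

With RF=3, `quorum(3)` is 2, so a QUORUM write tolerates one replica being down; with RF=2, `quorum(2)` is also 2, so it tolerates none.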
Consistency in Cassandra is defined at the statement level. That means you specify, for a particular query, what level of consistency you need.
This implies that if the consistency level is not met, that particular statement has not met its consistency requirements.
There is no rollback in Cassandra. What you have in Cassandra is eventual consistency. That means your statement might succeed in the future if not immediately. When a replica node comes back alive, the cluster (i.e. Cassandra's fault tolerance) will take care of writing to that replica node.
So, if your statement failed, it might still succeed in the future. This is in contrast to the RDBMS world, where an uncommitted transaction is rolled back as if nothing had happened.
Update:
I stand corrected. Thanks Arun.
From:
http://docs.datastax.com/en/cassandra/2.0/cassandra/dml/dml_about_hh_c.html
During a write operation, when hinted handoff is enabled and consistency can be met, the coordinator stores a hint about dead replicas in the local system.hints table under either of these conditions:
So it's still not a rollback. Nodes know the current cluster state and don't initiate the write if the consistency level cannot be met.
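A minimal sketch of the hinted handoff behaviour quoted above, in plain Python. The `Coordinator` class is hypothetical and greatly simplified: hints live in a list instead of a system table, and replay ignores timestamps and hint expiry.

```python
class Coordinator:
    """Toy coordinator (hypothetical) that stores hints for dead replicas."""

    def __init__(self, replica_names):
        self.replicas = {name: {} for name in replica_names}
        self.alive = {name: True for name in replica_names}
        self.hints = []  # list of (target replica, key, value)

    def write(self, key, value, required):
        live = [n for n in self.replicas if self.alive[n]]
        if len(live) < required:
            # Consistency cannot be met: no write, and no hint either.
            raise RuntimeError("unavailable")
        for name in self.replicas:
            if self.alive[name]:
                self.replicas[name][key] = value
            else:
                self.hints.append((name, key, value))

    def node_recovered(self, name):
        """Mark a replica alive again and replay the hints kept for it."""
        self.alive[name] = True
        remaining = []
        for target, key, value in self.hints:
            if target == name:
                self.replicas[name][key] = value
            else:
                remaining.append((target, key, value))
        self.hints = remaining


coordinator = Coordinator(["a", "b", "c"])
coordinator.alive["c"] = False
coordinator.write("k", "v", required=2)   # succeeds on a and b, hint for c
hints_before_recovery = list(coordinator.hints)
coordinator.node_recovered("c")           # hint is replayed to c
```

After the replay, replica `c` holds the row as well, even though it was down when the write was accepted.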
At driver level, you get an exception.
On the nodes that the write succeeded, the data is actually written and it is going to be eventually rolled back.
In a normal situation, you can consider that the data was not written to any of the nodes.
From the documentation:
If the write fails on one of the nodes but succeeds on the other,
Cassandra reports a failure to replicate the write on that node.
However, the replicated write that succeeds on the other node is not
automatically rolled back.
The Cassandra 2.0 documentation contains the following paragraph on Atomicity:
For example, if using a write consistency level of QUORUM with a replication factor of 3, Cassandra will replicate the write to all nodes in the cluster and wait for acknowledgement from two nodes. If the write fails on one of the nodes but succeeds on the other, Cassandra reports a failure to replicate the write on that node. However, the replicated write that succeeds on the other node is not automatically rolled back.
So, write requests are sent to 3 nodes, and we're waiting for 2 ACKs. Let's assume we only receive 1 ACK (before timeout). So it's clear, that if we read with consistency ONE, that we may read the value, ok.
But which of the following statements is also true:
It may occur, that the write has been persisted on a second node, but the node's ACK got lost? (Note: This could result in a read of the value even at read consistency QUORUM!)
It may occur, that the write will be persisted later to a second node (e.g. due to hinted handoff)? (Note: This could result in a read of the value even at read consistency QUORUM!)
It's impossible, that the write is persisted on a second node, and the written value will eventually be removed from the node via ReadRepair?
It's impossible, that the write is persisted on a second node, but it is necessary to perform a manual "undo" action?
I believe you are mixing atomicity and consistency. Atomicity is not guaranteed across nodes whereas consistency is. Only writes to a single row in a single node are atomic in the truest sense of atomicity.
The only time Cassandra will fail a write is when too few replicas are alive when the coordinator receives the request, i.e. it cannot meet the consistency level. Otherwise your second statement is correct: the coordinator will store a hint that the failed node (replica) needs to have this row replicated.
This article describes the different failure conditions.
http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure