When executing a batch, if one of the operations in a TableBatchOperation fails:
1. Every operation in the batch is canceled
2. Every other operation that is valid is processed
3. The first valid operations in the queue are processed until one operation fails, and the following ones are not processed
The answer is 1: even if only one operation in the batch fails, the entire batch fails (in other words, it rolls back). This is similar to performing transactions in a relational database. What's interesting is that you get the index of the failed entity in the response when this happens. Check this thread for more details: Azure CloudTable.ExecuteBatch(TableBatchOperation) throws a storageexception. How can I find which operation(s) caused the exception?
Official blog post: http://blogs.msdn.com/b/windowsazurestorage/archive/2012/11/06/windows-azure-storage-client-library-2-0-tables-deep-dive.aspx
TableBatchOperations, or Entity Group Transactions, are executed atomically, meaning that either all operations will succeed or, if there is an error caused by one of the individual operations, the entire batch will fail.
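For illustration, here is a minimal sketch of that behavior using the classic Azure Storage client library for Java (its TableBatchOperation/CloudTable types mirror the .NET API discussed above). The table name, connection-string handling and the exact format of the extended error message are assumptions, not something taken from the question.

import com.microsoft.azure.storage.CloudStorageAccount;
import com.microsoft.azure.storage.StorageException;
import com.microsoft.azure.storage.table.CloudTable;
import com.microsoft.azure.storage.table.DynamicTableEntity;
import com.microsoft.azure.storage.table.TableBatchOperation;

public class BatchAtomicityDemo {
    public static void main(String[] args) throws Exception {
        CloudStorageAccount account =
                CloudStorageAccount.parse(System.getenv("AZURE_STORAGE_CONNECTION_STRING"));
        CloudTable table = account.createCloudTableClient().getTableReference("orders");

        // All entities in a single batch must share the same partition key.
        DynamicTableEntity first = new DynamicTableEntity();
        first.setPartitionKey("p1");
        first.setRowKey("r1");
        DynamicTableEntity second = new DynamicTableEntity();
        second.setPartitionKey("p1");
        second.setRowKey("r2");

        TableBatchOperation batch = new TableBatchOperation();
        batch.insert(first);
        batch.insert(second);

        try {
            table.execute(batch);   // either every operation commits, or none does
        } catch (StorageException e) {
            // The whole batch is rolled back. The extended error message is typically
            // prefixed with the zero-based index of the operation that failed,
            // e.g. "1:The specified entity already exists."
            System.err.println(e.getExtendedErrorInformation().getErrorMessage());
        }
    }
}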
Can documentDb stored procedures run in parallel and update the same object? Will documentDb process them sequentially?
Consider the following scenario.
I have an app, and I have 10000 coins to give away to my users when they complete a task. I have the following object:
{
remainingPoints: 10000
}
I have a stored procedure that subtracts 10 points from this object and adds them to the user's points.
Now let's say 10 users complete the task at the same time and I call the stored procedure 10 times concurrently. Will DocumentDB execute them sequentially, or will I have to execute the stored procedures sequentially myself?
I had similar questions when I first started using DocumentDB and got good answers here and in email from the DocumentDB product managers. Quoting:
Stored procedures ... get an isolated snapshot of the database for transactional support. The snapshot reflects the current state of the world (no stale data) at the time the sproc begins execution (strongly consistent).
Caveat – since stored procedures are operating on a snapshot, you can still get a stale read in a sproc if a new write comes in from the outside world during execution.
Also, stored procedures will ALWAYS read their own writes.
Sprocs are DocumentDB’s mechanism for multi-document transactions. Sproc writes are committed when a sproc successfully completes execution. If an exception is thrown, all work done in a sproc gets rolled back.
So if two sprocs are running concurrently, they won’t see each other’s writes.
If both sprocs happen to write to the same document (replace) – then the 2nd one will fail due to an etag mismatch when it attempts to commit writes.
From that, I went forward with my design making sure to use ETags in my writes, as @Julian suggests. I also automatically retry each sproc execution up to 3 times to handle the case where it fails due to parallel operations, among other reasons. In practice, I've never exceeded the 3 retries (except in cases where my sproc had a bug), and I rarely even get a single retry.
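To make the retry idea concrete, here is a hedged sketch using the (older) Azure DocumentDB Java SDK. The sproc link, the parameter list, the status codes treated as retryable (412 precondition failed / 449 retry-with) and the backoff are my assumptions for illustration, not the exact code referred to above.

import com.microsoft.azure.documentdb.ConnectionPolicy;
import com.microsoft.azure.documentdb.ConsistencyLevel;
import com.microsoft.azure.documentdb.DocumentClient;
import com.microsoft.azure.documentdb.DocumentClientException;
import com.microsoft.azure.documentdb.StoredProcedureResponse;

public class SprocRetry {
    private static final int MAX_ATTEMPTS = 3;

    static String awardPoints(DocumentClient client, String userId)
            throws DocumentClientException, InterruptedException {
        String sprocLink = "dbs/mydb/colls/points/sprocs/awardPoints"; // hypothetical link
        DocumentClientException last = null;
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            try {
                StoredProcedureResponse response =
                        client.executeStoredProcedure(sprocLink, new Object[] { userId, 10 });
                return response.getResponseAsString();
            } catch (DocumentClientException e) {
                int status = e.getStatusCode();
                // 412/449 as "concurrency conflict" codes is an assumption for this sketch.
                if (status != 412 && status != 449) {
                    throw e;                 // not a conflict; don't retry
                }
                last = e;
                Thread.sleep(50L * attempt); // small backoff before retrying
            }
        }
        throw last;                          // still conflicting after 3 attempts
    }

    public static void main(String[] args) throws Exception {
        DocumentClient client = new DocumentClient(
                "https://myaccount.documents.azure.com:443/",   // hypothetical endpoint
                System.getenv("DOCUMENTDB_KEY"),
                ConnectionPolicy.GetDefault(),
                ConsistencyLevel.Session);
        System.out.println(awardPoints(client, "user-42"));
    }
}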
From the behavior I observe, I assume that it sends each new sproc execution to a different replica until it runs out of replicas, and then it queues them for sequential execution, so it's a hybrid of parallel and serial execution.
One other tip that I learned through experimentation is that you are better off doing pure read operations (no writes and no significant aggregation) client-side rather than in a sproc when you are on a heavily loaded system. I assume the advantage is because DocumentDB can satisfy different reads from different replicas in parallel. I have modularized my sproc code using the expandScript functionality of documentdb-utils to make sure that I use the exact same code for write validation, intra-document consistency, and derived fields both client-side and server-side, which is possible using node.js. Even if you are mostly .NET, you may want to use expandScripts to build your sprocs in a modular DRY way. You'll still need to run node.js in your build process to pre-process your sprocs or use Edge.NET (node running inside of .NET) to do so on the fly.
It will depend on the consistency level you have chosen for your collection, but the idea is that DocumentDB handles concurrency using etags and executes the stored procedure on a snapshot of the document version, committing the result only if the execution succeeds.
See: https://azure.microsoft.com/en-us/documentation/articles/documentdb-faq/#develop
This thread may help too: Atomically increment an integer in a document in Azure DocumentDB
I am trying to execute a getRange command in fdbcli, but it fails with:
FDBException: Transaction is too old to perform reads or be committed
What is the meaning of this particular exception?
Does it mean my query took more than 5 seconds to complete?
FDB keeps track of the transactions started within the last 5 seconds, and the data nodes only keep versions for the last 5 seconds. So if the read version is older than the oldest version kept by the data nodes, the data nodes have no way to answer the request; that's why FDB throws this exception. The trick to avoid such exceptions is to split one huge, long-running transaction into many small transactions. I also noticed that FDB performs really well if the transaction time is under 300 ms.
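As an illustration of that advice (not the original poster's code), here is a sketch using the FoundationDB Java binding that splits one large range scan into many short transactions by remembering the last key read. The key prefix ("demo"), the API version and the batch size are assumptions.

import com.apple.foundationdb.Database;
import com.apple.foundationdb.FDB;
import com.apple.foundationdb.KeySelector;
import com.apple.foundationdb.KeyValue;
import com.apple.foundationdb.Range;
import com.apple.foundationdb.tuple.Tuple;
import java.util.List;

public class ChunkedRangeScan {
    public static void main(String[] args) {
        try (Database db = FDB.selectAPIVersion(620).open()) {
            Range range = Tuple.from("demo").range();   // every key under the "demo" prefix
            final int batchSize = 1000;                 // keep each transaction short
            byte[] lastKey = null;

            while (true) {
                final byte[] last = lastKey;
                // Each chunk is read in its own short-lived transaction, so no single
                // transaction comes anywhere near the 5-second limit.
                List<KeyValue> chunk = db.read(tr -> tr.getRange(
                        last == null ? KeySelector.firstGreaterOrEqual(range.begin)
                                     : KeySelector.firstGreaterThan(last),
                        KeySelector.firstGreaterOrEqual(range.end),
                        batchSize).asList().join());

                for (KeyValue kv : chunk) {
                    // process(kv.getKey(), kv.getValue());
                }
                if (chunk.size() < batchSize) {
                    break;                              // end of the range reached
                }
                lastKey = chunk.get(chunk.size() - 1).getKey();
            }
        }
    }
}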
Firstly - yes, you are correct (your query took more than 5 seconds to complete).
If the read request’s timestamp is older than 5 seconds, the storage server may have already flushed the data from its in-memory multi-version data structure to its on-disk single-version data structure. This means the storage server no longer has data older than 5 seconds, so the client will receive the error you've mentioned.
NB: You can avoid this problem by using a RecordCursor and passing a continuation to your query.
More on continuations here.
How can atomic batches guarantee that either all statements in a single batch will be executed or none?
In order to understand how batches work under the hood, it's helpful to look at the individual stages of the batch execution.
The client
Batches are supported using CQL3 or modern Cassandra client APIs. In each case you'll be able to specify a list of statements you want to execute as part of the batch, a consistency level to be used for all statements and an optional timestamp. You'll be able to batch execute INSERT, DELETE and UPDATE statements. If you choose not to provide a timestamp, the current time is automatically used and associated with the batch.
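For example, with the DataStax Java driver 3.x a logged batch with a single consistency level and an optional client-side timestamp might look like the following hedged sketch; the keyspace, table and column names are made up.

import com.datastax.driver.core.BatchStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class LoggedBatchExample {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("shop")) {

            BatchStatement batch = new BatchStatement(BatchStatement.Type.LOGGED);
            batch.add(new SimpleStatement(
                    "INSERT INTO orders (order_id, status) VALUES (?, ?)", 42, "PLACED"));
            batch.add(new SimpleStatement(
                    "UPDATE orders_by_user SET status = ? WHERE user_id = ? AND order_id = ?",
                    "PLACED", 7, 42));

            // One consistency level and one timestamp (microseconds) for the whole batch.
            batch.setConsistencyLevel(ConsistencyLevel.QUORUM);
            batch.setDefaultTimestamp(System.currentTimeMillis() * 1000L);

            session.execute(batch);
        }
    }
}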
The client will have to handle two exceptions in case the batch cannot be executed successfully (a handling sketch follows after the next paragraph):
UnavailableException - there are not enough nodes alive to fulfill any of the updates with the specified batch CL
WriteTimeoutException - timeout while either writing the batchlog or applying any of the updates within the batch. This can be checked by reading the writeType value of the exception (either BATCH_LOG or BATCH).
Failed writes during the batchlog stage will be retried once automatically by the DefaultRetryPolicy in the Java driver. Batchlog creation is critical to ensure that a batch will always be completed in case the coordinator fails mid-operation. Read on to find out why.
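A hedged sketch of handling those two exceptions with the DataStax Java driver 3.x, assuming a BatchStatement built as in the previous example; the retry hints in the comments are illustrative, not a policy recommendation.

import com.datastax.driver.core.BatchStatement;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.WriteType;
import com.datastax.driver.core.exceptions.UnavailableException;
import com.datastax.driver.core.exceptions.WriteTimeoutException;

public class BatchErrorHandling {
    static void executeBatch(Session session, BatchStatement batch) {
        try {
            session.execute(batch);
        } catch (UnavailableException e) {
            // Not enough replicas alive for the batch consistency level.
            System.err.println("Unavailable: needed " + e.getRequiredReplicas()
                    + ", alive " + e.getAliveReplicas());
        } catch (WriteTimeoutException e) {
            if (e.getWriteType() == WriteType.BATCH_LOG) {
                // Timed out while writing the batchlog: the batch may not have been
                // recorded anywhere, so retrying the whole batch is reasonable.
                System.err.println("Batchlog write timed out; the batch can be retried");
            } else if (e.getWriteType() == WriteType.BATCH) {
                // The batchlog was written, so the batch will eventually be replayed
                // and completed even though this request timed out.
                System.err.println("Batch timed out after the batchlog was written");
            }
        }
    }
}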
The coordinator
All batches sent by the client will be executed by the coordinator just as with any write operation. What's different from normal write operations is that Cassandra will also make use of a dedicated log containing all pending batches currently being executed (called the batchlog). This log is stored in the local system keyspace and is managed by each node individually. Each batch execution starts by creating a log entry with the complete batch, preferably on two nodes other than the coordinator. After the coordinator has been able to create the batchlog on those other nodes, it will start to execute the actual statements in the batch.
Each statement in the batch will be written to the replicas using the CL and timestamp of the whole batch. Apart from that, there's nothing special about the writes happening at this point. Writes may also be hinted or throw a WriteTimeoutException, which can be handled by the client (see above).
After the batch has been executed, all created batchlogs can be safely removed. Therefore the coordinator will send a batchlog delete message upon successful execution to the nodes that received the batchlog before. This happens in the background and will go unnoticed if it fails.
Let's wrap up what the coordinator does during batch execution:
sends the batchlog to two other nodes (preferably in different racks)
executes all statements in the batch
deletes the batchlog from those nodes after successful batch execution
The batchlog replica nodes
As described above, the batchlog will be replicated to two other nodes (if the cluster size allows it) before batch execution. The idea is that any of these nodes will be able to pick up pending batches in case the coordinator goes down before finishing all statements in the batch.
What makes things a bit complicated is the fact that those nodes won't notice that the coordinator is no longer alive. The only point at which the batchlog nodes are updated with the current status of the batch execution is when the coordinator issues a delete message indicating the batch has been successfully executed. If such a message doesn't arrive, the batchlog nodes will assume the batch hasn't been executed for some reason and replay the batch from the log.
Batchlog replay takes place potentially every minute, i.e. that is the interval at which a node will check whether there are any pending batches in the local batchlog that haven't been deleted by the (possibly dead) coordinator. To give the coordinator some time between batchlog creation and the actual execution, a fixed grace period is used (write_request_timeout_in_ms * 2, 4 seconds by default). If the batchlog entry still exists after those 4 seconds, it will be replayed.
Just as with any write operation in Cassandra, timeouts may occur. In this case the node will fall back to writing hints for the timed-out operations. When the timed-out replicas are up again, writes can resume from the hints. This behavior doesn't seem to be affected by whether hinted_handoff_enabled is set or not. There's also a TTL value associated with the hint, which will cause it to be discarded after a longer period of time (the smallest GCGraceSeconds of any involved column family).
Now you might be wondering whether it isn't potentially dangerous to replay a batch on two nodes at the same time, which may happen as we replicate the batchlog on two nodes. What's important to keep in mind here is that each batch execution is idempotent, due to the limited kinds of supported operations (updates and deletes) and the fixed timestamp associated with the batch. There won't be any conflicts even if both nodes and the coordinator retry executing the batch at the same time.
Atomicity guarantees
Let's get back to the atomicity aspects of "atomic batches" and review what exactly is meant by atomic (source):
"(Note that we mean “atomic” in the database sense that if any part of
the batch succeeds, all of it will. No other guarantees are implied;
in particular, there is no isolation; other clients will be able to
read the first updated rows from the batch, while others are in
progress."
So in a sense we get "all or nothing" guarantees. In most cases the coordinator will just write all the statements in the batch to the cluster. However, in case of a write timeout, we must check at which point the timeout occurred by reading the writeType value. The batch must have been written to the batchlog in order for those guarantees to still apply. Also, at this point other clients may read partially executed results from the batch.
Getting back to the question: how can Cassandra guarantee that either all statements in a batch are executed or none at all?
Atomic batches basically depend on successful replication and idempotent statements. It's not a 100% guaranteed solution, as in theory there might be scenarios that will still cause inconsistencies. But for a lot of use cases in Cassandra it's a very useful tool if you're aware of how it works.
Batch documentation (doc):
In Cassandra 1.2 and later, batches are atomic by default. In the context of a Cassandra batch operation, atomic means that if any of the batch succeeds, all of it will. To achieve atomicity, Cassandra first writes the serialized batch to the batchlog system table that consumes the serialized batch as blob data. When the rows in the batch have been successfully written and persisted (or hinted) the batchlog data is removed. There is a performance penalty for atomicity. If you do not want to incur this penalty, prevent Cassandra from writing to the batchlog system by using the UNLOGGED option: BEGIN UNLOGGED BATCH
Cassandra batches:
http://docs.datastax.com/en/cql/3.1/cql/cql_reference/batch_r.html
To add to the above answers:
With Cassandra 2.0, you can combine batch statements with LWT (lightweight transactions). The restriction, though, is that all DML statements must be on the same partition.
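For illustration, here is a hedged sketch of such a conditional (LWT) batch with the Java driver, where every statement targets the same partition; the wallet table and its columns are assumptions, not from the answer above.

import com.datastax.driver.core.BatchStatement;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class LwtBatchExample {
    // Assumed schema: wallet (user_id int, asset text, balance int,
    //                         PRIMARY KEY (user_id, asset))
    static boolean spendCoins(Session session) {
        BatchStatement batch = new BatchStatement();
        // Both statements hit partition user_id = 42; conditional batches
        // are restricted to a single partition.
        batch.add(new SimpleStatement(
                "UPDATE wallet SET balance = 90 WHERE user_id = 42 AND asset = 'coins' IF balance = 100"));
        batch.add(new SimpleStatement(
                "UPDATE wallet SET balance = 110 WHERE user_id = 42 AND asset = 'points'"));

        ResultSet rs = session.execute(batch);
        // wasApplied() is false if the IF condition did not hold; in that case
        // none of the statements in the batch are applied.
        return rs.wasApplied();
    }
}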
I have a Quartz schedule which is inserting data into the TblTransactions table. I want to run another Quartz schedule with multiple instances/threads which will fetch records from TblTransactions, do some processing, and delete the records.
How do I ensure that a record fetched by one thread doesn't get fetched by another thread?
Can I integrate Oracle Advanced Queuing with Hibernate? What other options can I consider?
I am using Hibernate with Oracle 11g.
It could get very tricky not to get the same record twice if multiple threads are reading the same table, even if you somehow mark them as fetched in the database (the other thread could read the row before the transaction commits).
The way I would implement this is to use a single thread to fetch the records, then split them up for processing and delegate N records to each processor-thread, and use Futures or callbacks to track the progress (so if some processor-thread fails, I know to re-submit the records for processing and/or log/email the error to alert admins so they know to check it out in case of invalid data or such).
Either the processor-threads could take care of removing the processed records themselves when they complete (either immediately after a single record has been processed, or all in one go after all records have been processed), or you could have a mapping in the fetch-thread from records to processor-threads, and once a thread finishes successfully, remove all the records it processed.
If the fetch operation is called periodically and there could still be old records in processing, you'd probably need the mapping on the fetch-thread side to know whether the fetched records contain records that are already being processed from an earlier fetch run.
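Here is a minimal sketch of that design, independent of Quartz/Hibernate specifics: a single fetcher splits the pending rows into chunks and hands them to a worker pool, tracking Futures so failed chunks can be re-submitted or reported. The TransactionDao interface and its methods are hypothetical placeholders.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TransactionProcessorJob {

    interface TransactionDao {                       // hypothetical Hibernate-backed DAO
        List<Long> fetchPendingIds(int limit);
        void processAndDelete(List<Long> ids);       // process + delete in one DB transaction
    }

    private final ExecutorService workers = Executors.newFixedThreadPool(4);
    private final TransactionDao dao;

    TransactionProcessorJob(TransactionDao dao) {
        this.dao = dao;
    }

    // Called by the single-instance fetch job, e.g. from a Quartz trigger.
    void runOnce() throws InterruptedException {
        List<Long> ids = dao.fetchPendingIds(1000);  // only this thread ever reads the table
        int chunkSize = 250;
        List<Future<?>> results = new ArrayList<>();

        for (int i = 0; i < ids.size(); i += chunkSize) {
            List<Long> chunk = ids.subList(i, Math.min(i + chunkSize, ids.size()));
            results.add(workers.submit(() -> dao.processAndDelete(chunk)));
        }
        for (int i = 0; i < results.size(); i++) {
            try {
                results.get(i).get();                // wait; rethrows worker failures
            } catch (ExecutionException e) {
                // Chunk failed: log/alert and re-submit or investigate, as described above.
                System.err.println("Chunk " + i + " failed: " + e.getCause());
            }
        }
    }
}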
Azure Table Storage offers a BatchOperation method. It returns a list of TableResults. From what I've seen, there is never a time where this return value will have mixed failures and successes (as a batch should be). I haven't been able to find documentation that says this is a fact though. If anyone has a handy link to this specific info let me know.
A TableBatchOperation is atomic, so there is no point in continuing to execute the batch after the first failure. There are two possible outcomes for a TableBatchOperation: either all operations succeed and the overall request succeeds, or the request fails on the first failed operation and the changes made by the previous operations are rolled back.
The interesting thing here is that you will get a StorageException if one of the operations in the batch fails, and the index of the failed operation is embedded inside the StorageException object. Then, if you want to, you can implement logic to automatically remove that operation from the batch (and log it) and resubmit the TableBatchOperation.
I have implemented a StorageException extension class which extracts the failed operation index and a lot of other useful information from the StorageException object.
Feel free to use it:
https://www.nuget.org/packages/AzureStorageExceptionParser/