Handling conflict in find, modify, save flow in MongoDB with Mongoose - node.js

I would like to update a document in a way that involves reading another collection and making complex modifications, so the update operators in findAndModify() cannot serve my purpose.
Here's what I have:
Collection.findById(id, function (err, doc) {
  if (err) {
    return res.json(500, { message: err });
  }
  // read from other collection, validation
  // modify fields in doc according to user input
  // (with decent amount of logic)
  doc.save(function (err, doc) {
    if (err) {
      return res.json(500, { message: err });
    }
    return res.json(200, doc);
  });
});
My worry is that this flow might cause conflicts if multiple clients happen to modify the same document.
It is said here that:
Operations on a single document are always atomic with MongoDB databases
I'm a bit confused about what "operations" means.
Does this mean that findById() will hold a lock until doc is out of scope (after the response is sent), so there wouldn't be conflicts? (I don't think so.)
If not, how do I modify my code to support multiple clients, knowing that they will modify Collection?
Will Mongoose report a conflict if one occurs?
How do I handle a possible conflict? Is it possible to manually lock the Collection?
I have seen suggestions to use Mongoose's versionKey (or a timestamp) and retry on a stale document
Don't use MongoDB at all...
Thanks.
EDIT
Thanks @jibsales for the pointer. I now use Mongoose's versionKey (a timestamp will also work) to avoid committing conflicting changes.
aaronheckmann — Mongoose v3 part 1 :: Versioning
See this sample code:
https://gist.github.com/anonymous/9dc837b1ef2831c97fe8
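The versionKey-and-retry idea from the EDIT can be sketched roughly as follows. This is a minimal sketch, not the code from the gist: saveWithRetry, loadDoc and mutate are hypothetical names, and it assumes the schema is configured so that save() rejects with an error named "VersionError" when the stored version no longer matches (for example via the optimisticConcurrency schema option in newer Mongoose releases, or an equivalent manual version check).

```javascript
// Sketch: retry a find-modify-save cycle when a version conflict is reported.
// Assumption: doc.save() rejects with err.name === "VersionError" if the
// document was changed by someone else after it was loaded.
async function saveWithRetry(loadDoc, mutate, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const doc = await loadDoc();   // e.g. () => Collection.findById(id)
    if (!doc) return null;         // nothing to update
    await mutate(doc);             // validation and field changes go here
    try {
      return await doc.save();     // succeeds only if the version still matches
    } catch (err) {
      const stale = err && err.name === "VersionError";
      if (!stale || attempt === maxAttempts) throw err;
      // stale document: loop around to re-read and re-apply the changes
    }
  }
}
```

In a route handler this would be called as something like saveWithRetry(() => Collection.findById(id), applyUserInput), where applyUserInput is whatever modification logic the request needs. Re-running mutate on a fresh copy matters because the validation may depend on the values another client just wrote.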

Operations refers to reads/writes. Bear in mind that MongoDB is not an ACID-compliant data layer, and if you need true ACID compliance, you're better off picking another tech. That said, you can achieve atomicity and isolation via the Two Phase Commit technique outlined in this article in the MongoDB docs. This is no small undertaking, so be prepared for some heavy lifting, as you'll need to work with the native driver instead of Mongoose. Again, my ultimate suggestion is to not drink the NoSQL koolaid if you need transaction support, which it sounds like you do.

When MongoDB receives a request to update a document, it will lock the database until it has completed the operation. Any other requests that MongoDB receives will wait until the locking operation has completed and the database is unlocked. This lock/wait behavior is automatic, so there aren't any conflicts to handle at the level of a single write operation. The lock covers only that one operation, though: it is not held between a findById() and a later save(), so two clients can still overwrite each other's changes in a find-modify-save flow. You can find a lot more information about this behavior in the Concurrency section of the FAQ.
See jibsales's answer for links to MongoDB's recommended technique for doing multi-document transactions.
There are a couple of NoSQL databases that do full ACID transactions, which would make your life a lot easier. FoundationDB is one such database. Data is stored as Key-Value but it supports multiple data models through layers.
Full disclosure: I'm an engineer at FoundationDB.

In my case the mistake was trying to query a dynamic field with the upsert option. This guide helped me: How to solve error E11000 duplicate
According to that guide, you're probably making one of two mistakes:
Upserting a document with findOneAndUpdate() when the query matches on a non-unique field.
Inserting many new documents in one go without setting "ordered: false".

Related

NodeJS MongoDB locks on documents

I am using the mongodb driver and am concerned about possible concurrency issues that could duplicate objects. Having read a few questions and answers on Stack Overflow, I believe that write operations are atomic, but this may not solve my concurrency problem. Let's say there are two concurrent calls to doSomeAndDelete with the same id: the operations at HERE might take some time, but only one of these two calls should be able to handle result. How can I implement a lock?
async function doSomeAndDelete(id) {
  const result = await myCollection.findOne({ _id: id });
  /* Some operations on result [HERE] */
  if (/* conditions */)
    await myCollection.deleteOne({ _id: id });
}
For deletion, only one of the operations will succeed and delete the document, while the other one will not delete anything because the document no longer exists. That assumes the _id will not be reused.
In general, a write operation on a single document is atomic, but a read followed by a write is not. So if you have multiple threads reading and then writing a document, you might want to use MongoDB transactions or some form of optimistic locking. For example, you can use an ObjectId field in your documents as a version id, and use a new value for each update. When you read-and-update a document, you validate that the field still has the value you obtained from the read, meaning the record has not been modified since you read it.
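The optimistic locking described above might look roughly like this with the Node.js driver. It is only a sketch under a few assumptions: updateIfUnchanged and computeChanges are hypothetical names, every document is assumed to carry a numeric version field, and an integer counter is swapped in for the ObjectId version id mentioned above to keep the example small.

```javascript
// Sketch: compare-and-set update guarded by a version field.
// Assumption: each document has a numeric `version` field.
async function updateIfUnchanged(collection, id, computeChanges) {
  const current = await collection.findOne({ _id: id });
  if (!current) return false;

  // Possibly slow work based on the values that were read.
  const changes = await computeChanges(current);

  // The filter matches only if nobody bumped the version in the meantime.
  const result = await collection.updateOne(
    { _id: id, version: current.version },
    { $set: changes, $inc: { version: 1 } }
  );
  return result.modifiedCount === 1; // false means the document was stale
}
```

A caller that gets false back can re-read and try again, or report the conflict to the user. Because the version check and the write happen in a single updateOne, the server evaluates them atomically for that document.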

Create a simple mutex with MongoDB

I just need a simple mutex, stored in MongoDB. I want a lock given a unique id. There seem to be many popular solutions with Redis, but in this case, since we are already using MongoDB, I am looking for some sort of library that I can use for locking with MongoDB, but I can't find any good packages. Is there a way to do a simple lock with Mongoose or the official MongoDB node.js driver?
I am especially looking for some mutex in MongoDB that has a built-in TTL (time to live). With Redis, you can give a key a TTL and it will remove itself after a period of time, that's an essential feature.
When I google "mongodb + ttl" this is what I see:
https://docs.mongodb.com/manual/core/index-ttl/
To recap our discussion in the comments...
DBMS Transaction Locking
If you're asking about locking at the DBMS transaction level, I think you will find that most DBMSs (SQL or NoSQL) handle transactions / locking on their own (i.e. a read operation on a record will wait until a write operation is finished). In MongoDB, since each operation is a single transaction, they've provided a particularly helpful atomic operation called "findAndModify" (exposed as findOneAndUpdate() in the Node.js driver and Mongoose).
Domain Specific Locking
Nothing is stopping you from creating some sort of "locks" collection which must be checked before certain operations are made. You will definitely need to consider and take note of the "edge" cases that could result in illegal state or data inconsistency. This is a good time to also reevaluate your architecture (hint: microservices).
TTL
Mongo supports specifying a TTL index on any date field. So, in your case you could consider adding an index like so: db.my_locks.createIndex( { "deleteAt": 1 }, { expireAfterSeconds: 1 } ) and specifying "deleteAt" on insert.
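Putting the "locks" collection and the TTL index together, a simple mutex could be sketched like this. The names acquireLock and releaseLock are hypothetical, and the sketch assumes the lock id is stored as _id (which is always uniquely indexed) and that a TTL index on "deleteAt" like the one above already exists. A failed insert with duplicate-key error code 11000 is treated as "someone else holds the lock".

```javascript
// Sketch: mutex backed by a "locks" collection.
// Assumption: a TTL index on `deleteAt` removes abandoned locks.
async function acquireLock(locks, id, ttlMs) {
  try {
    // _id is unique, so only one concurrent insert for this id can succeed.
    await locks.insertOne({ _id: id, deleteAt: new Date(Date.now() + ttlMs) });
    return true;
  } catch (err) {
    if (err && err.code === 11000) return false; // duplicate key: already locked
    throw err;                                   // anything else is a real error
  }
}

async function releaseLock(locks, id) {
  await locks.deleteOne({ _id: id });
}
```

One caveat with this design: the server removes expired documents in a background task that runs periodically (roughly once a minute), so a lock can outlive its deleteAt time by a while. If tighter expiry matters, the acquiring side would also need to compare deleteAt against the current time.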

In a nested MongoDB call, how do I ensure atomicity?

Is it possible to atomically update/remove two documents in MongoDB by calling a new update/remove call from within the first update's callback? In the case below, I want to remove the second document from the collection, but only if the update to the first document succeeds:
db.collection.update(conditions1, {$set: set}, function (err, result) {
  // only remove the second document if the first update succeeded
  if (err) return db.close();
  db.collection.remove(conditions2, function (err, doc_num) {
    db.close();
  });
});
I'm coming across the $isolated query operator, but from what I understand in the documentation, this operator is used for performing a read/write lock on a single query which affects multiple documents, not on performing a read/write lock on one document after performing an update on another document through the first document update's callback, which is what I want to try and accomplish.
No, it's not possible. As documented here, a lock would be acquired for a single query and not for a whole transaction.
You can overcome the atomicity problem by using this.
As Amir said, it's not possible, but you can mimic the behavior in mongo by following the two phase commit pattern. That link also links to how to perform rollback-like operations.

Bulk operation by mongoose

I want to store bulk data (more than 1000 or 10000 records) in a single operation with Mongoose. But Mongoose does not support bulk operations, so I will use the native driver (MongoDB, for insertion). I know that I will bypass all Mongoose middleware, but that's OK. (Please correct me if I am wrong! :) )
I have an option to store data by insert method. But MongoDB also provides Bulk class (ordered and unordered operations). Now I have the following questions:
What is the difference between insert and a bulk operation (both can store bulk data)?
Is there any specific difference between initializeUnorderedBulkOp() (performs operations in parallel) and initializeOrderedBulkOp() (performs operations serially)?
If I use initializeUnorderedBulkOp(), will it affect range searches or have any other side effects?
Can I do it with promisification (using Bluebird)? (I am trying to do it.)
Thanks
EDIT: I am asking about bulk vs insert with regard to multiple insertions. Which one is better: inserting one by one through the bulk builder, or inserting in batches (of 1000) with the insert method? I hope it is clear now. See also this link: Mongoose (mongodb) batch insert?
If you are calling this from a mongoose model you need the .collection accessor
var bulk = Model.collection.initializeOrderedBulkOp();
// examples
bulk.insert({ "a": 1 });
bulk.find({ "a": 1 }).updateOne({ "$set": { "a": 2 } });
bulk.execute(function(err,result) {
// result contains stats of the operations
});
You need to be "careful" when doing this though. Apart from not being bound to the same checks and validation that can be attached to mongoose schemas, when you call .collection you need to be "sure" that the connection to the database has already been made. Mongoose methods look after this for you, but once you use the underlying driver methods you are all on your own.
As for differences, it's all there in the naming:
Ordered: Means that the batched instructions are executed in the same order they are added. They execute one after the other in sequence and one at a time. If an error occurs at any point, the execution of the batch is halted and the error response returned. All operations up until then are "committed". This is not a rollback.
UnOrdered: Means that batched operations can execute in "any" sequence and often in parallel. This can lead to faster updates, but of course cannot be used where one bulk operation in the batch is meant to occur before another ( example above ). Any errors that occur are merely "reported" in the result, and the whole batch will complete as sent to the server.
Of course the core difference for either type of execution from the standard methods is that the "whole batch" ( actually in lots of 1000 maximum ) is sent to the server and you only get one response back. This saves network traffic and waiting for each individual .insert() or other like operation to complete.
As for whether a "promise" can be used: anything with a callback can be converted to return a promise, and the same rules apply here. Remember though that the "callback/promise" is on the .execute() method, and that what you get back follows the rules for Bulk operation results.
For more information see "Bulk" in the core documentation.
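For the "insert in batches of 1000" option raised in the question, a rough sketch could look like the following. insertInBatches and chunk are hypothetical helper names, and the sketch assumes a driver version in which insertMany() returns a promise when no callback is passed. With a Mongoose model, the collection argument would be Model.collection, subject to the same connection caveat as above.

```javascript
// Split an array into consecutive slices of at most `size` items.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Sketch: insert documents in unordered batches and count what went in.
async function insertInBatches(collection, docs, batchSize = 1000) {
  let inserted = 0;
  for (const batch of chunk(docs, batchSize)) {
    // ordered: false lets the server continue past individual failures
    const result = await collection.insertMany(batch, { ordered: false });
    inserted += result.insertedCount;
  }
  return inserted;
}
```

Batches are awaited one at a time here to keep memory use and error handling simple. If a batch fails, insertMany() rejects and the loop stops, so a caller that wants to continue past failed batches would need its own try/catch around the call.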

How to account for a failed write or add process in Mongodb

So I've been trying to wrap my head around this one for weeks, but I just can't seem to figure it out. MongoDB isn't equipped to deal with rollbacks as we typically understand them (i.e. when a client adds information to the database, like a username for example, but quits in the middle of the registration process). Now the DB is left with some "hanging" information that isn't associated with anything. How can MongoDB handle that? Or if no one can answer that question, maybe they can point me to a source/example that can? Thanks.
MongoDB versions before 4.0 do not support multi-document transactions, so you can't perform atomic multi-statement transactions to ensure consistency. You can only perform an atomic operation on a single document at a time. When dealing with NoSQL databases you need to validate your data as much as you can, because they seldom complain about anything. There are some workarounds or patterns to achieve SQL-like transactions. For example, in your case, you can store the user's information in a temporary collection, check data validity, and store it in the users collection afterwards.
This should be straightforward, but things get more complicated when we deal with multiple documents. In this case, you need to create a designated collection for transactions. For instance,
transaction collection
{
id: ..,
state : "new_transaction",
value1 : values From document_1 before updating document_1,
value2 : values From document_2 before updating document_2
}
// update document 1
// update document 2
Ooohh!! something went wrong while updating document 1 or 2? No worries, we can still restore the old values from the transaction collection.
This pattern, known as compensation, mimics the transactional behavior of SQL.
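The compensation idea above could be sketched along these lines. Everything here is illustrative: updateBothWithCompensation, the txId argument, and the "done" / "rolled_back" state names are assumptions added for the example, and only the "new_transaction" state and the value1 / value2 fields come from the outline above.

```javascript
// Sketch: update two documents, restoring the old values if a write fails.
// `first` and `second` are { id, changes } pairs (hypothetical shape).
async function updateBothWithCompensation(collection, txLog, txId, first, second) {
  const before1 = await collection.findOne({ _id: first.id });
  const before2 = await collection.findOne({ _id: second.id });
  if (!before1 || !before2) throw new Error("both documents must exist");

  // Record the old values before touching anything.
  await txLog.insertOne({
    _id: txId,
    state: "new_transaction",
    value1: before1,
    value2: before2,
  });

  try {
    await collection.updateOne({ _id: first.id }, { $set: first.changes });
    await collection.updateOne({ _id: second.id }, { $set: second.changes });
    await txLog.updateOne({ _id: txId }, { $set: { state: "done" } });
    return true;
  } catch (err) {
    // Compensate: put both documents back the way they were.
    await collection.replaceOne({ _id: first.id }, before1);
    await collection.replaceOne({ _id: second.id }, before2);
    await txLog.updateOne({ _id: txId }, { $set: { state: "rolled_back" } });
    return false;
  }
}
```

This gives rollback-like behavior but not isolation: other clients can observe the state between the two updates, and a crash in the middle leaves a "new_transaction" entry that some recovery job would have to pick up and restore from.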

Resources