Concurrency update in MongoDB - Node.js

I have used this code on the Node.js server side to update multiple embedded documents.
DetailerItemGroupModel.find({
    "_id": itemGroupId
})
.then(function (docs) {
    docs.forEach(function (doc) {
        doc.Products.forEach(function (ch) {
            // do something before update
        });
        doc.save();
    });
});
I have a scenario like this:
Client A and client B both do a GET HTTP request and receive the same document at the same time; then client B performs the update first (calling the server, which runs the code above), and client A performs its update afterwards.
I want client A's UPDATE HTTP request to be checked against the latest data (in this case the document already updated by B). In other words, I want to somehow cancel client A's request and return a bad request response telling it that the data it is trying to update was changed by another client. Is there any way to implement that?
I read about "__v", but I am not sure it works when clients A and B send requests to update the same document at the same time, and whether it works with forEach(). I changed the code to this:
DetailerItemGroupModel.find({
    "_id": itemGroupId,
    "__v": {document version}
})
.then(function (docs) {
    docs.forEach(function (doc) {
        doc.Products.forEach(function (ch) {
            // do something before update
        });
        doc.save();
    });
});

The idea is that you want your "Save" to only work on the version that you have (which should be the latest).
Using .save will not help you here.
You can use one of two functions:
update
findOneAndUpdate
The idea is that you have to do the find and the save in one atomic operation. In the conditions object you send not only the _id but also the __v, which is the version identifier (something similar to the example below):
this.update(
    { _id: document._id, __v: document.__v },
    { $set: document, $inc: { __v: 1 } },
    result.cb(cb, function (updateResult) {
        // ...
    })
);
Say you read version 10 and you are now ready to update (save) it. The update will only succeed if it finds the document with that particular id and version. If in the meantime the database received a newer version (version 11), your update will fail.
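In Mongoose, that conditional update could look something like the sketch below (a rough illustration, not the original answer's code; itemGroupId, clientVersion, updatedProducts, and the Express-style res are assumed names):
DetailerItemGroupModel.findOneAndUpdate(
    { _id: itemGroupId, __v: clientVersion },                   // only match the version the client read
    { $set: { Products: updatedProducts }, $inc: { __v: 1 } },  // apply the changes and bump the version
    { new: true },
    function (err, doc) {
        if (err) return res.status(500).send(err);
        if (!doc) {
            // no match: the document was changed by another client since it was read
            return res.status(409).send("The document was modified by another client");
        }
        return res.send(doc);
    }
);
A null result means no document matched the id plus version pair, which is exactly the "changed by another client" case that should be reported back to client A.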
BTW, MongoDB now has transactions; you might want to look into those, because transaction B would make your code wait until transaction A finishes its atomic operation, which can be a better model than doing an update, failing, and then trying again...

Related

Nodejs, MongoDB concurrent request creates duplicate record

Let me keep it really simple. I am running a Node.js server. When I receive data from a PATCH request, I first need to check the database: if the record exists I update it, otherwise I create it. Here is my code; this is what I am calling in the request handler callback.
let dbCollection = db$master.collection('users');
createUser(req.body).then(user => {
    dbCollection.updateMany(
        { rmisId: req.body.rmisId },
        { $set: { ...user } },
        { upsert: true }
    ).then((result) => {
        log.debug("RESULTS", user);
        return result;
    })
    .catch((err) => {
        log.debug(err);
        return err;
    });
});
This works fine for sequential requests, but it creates duplicate records when I receive 10 concurrent requests. I am running on my local machine and simulating concurrent requests with Apache JMeter. Please help me if you have experienced this kind of problem.
Thank you!
UPDATE
I have also tested another approach that first reads the database with dbCollection.find({rmisId: req.body.rmisId}) to determine whether the record exists, but it makes no difference at all.
You cannot check-and-update. MongoDB operations are atomic at the document level. After you check and see that the record does not exist, another request may create the document you just checked for, and after that you may create the same record again if you don't have unique indexes or if you're generating the IDs yourself.
Instead, you can use upsert, as you're already doing, but without the separate create. It looks like you're getting the ID from the request, so simply search using that ID and upsert the user record. That way, if some other thread inserts it before you do, you'll update what the previous thread inserted. If that is not the behavior you prefer, add a unique index on that user ID field.
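If you go the unique-index route, here is a minimal sketch (reusing the db$master handle and log object from the question; the rest is illustrative):
let usersCollection = db$master.collection('users');
// Enforce uniqueness on rmisId so concurrent upserts cannot create duplicates;
// a losing concurrent writer gets a duplicate key error instead of a second document.
usersCollection.createIndex(
    { rmisId: 1 },
    { unique: true },
    (err, indexName) => {
        if (err) return log.debug(err);
        log.debug('unique index created:', indexName);
    }
);
With the index in place, the upsert path keeps working for the winner, and the loser surfaces a duplicate key error (code 11000) that you can handle, for example by retrying as a plain update.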

How to Determine if a Document was Actually Changed During Update in MongoDB

I am using the Mongoose driver with NodeJS. I have quite a simple update call whose purpose is to sync an external source of meetings to my database:
collection.update({ meeting_id: doc.meeting_id }, newDoc, { upsert: true })
The returned object tells me whether an update or an insert occurred, and this works perfectly. My issue is that I must determine whether an actual change occurred: when you update a document with an identical copy of itself, MongoDB treats it exactly the same way as if every field had changed.
So my question is: is there any good way to tell if anything actually changed? I could fetch each document and then compare each field manually, but this seems like a poor (and slow) solution.
You can use findAndModify, which returns the updated document, as opposed to update, which only returns the number of updated records.
collection.findAndModify(
    { meeting_id: doc.meeting_id },  // query
    [],                              // sort (the native driver expects this positional argument)
    newDoc,                          // replacement document
    { new: true },                   // return the updated document instead of the original
    function (err, documents) {
        res.send({ error: err, affected: documents });
    }
);
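If you also need to know whether the write actually changed anything (not part of the original answer, just one possible approach): ask for the pre-update document and compare the fields you wrote. A rough sketch, assuming a Mongoose model named Meeting:
Meeting.findOneAndUpdate(
    { meeting_id: doc.meeting_id },
    newDoc,
    { new: false, upsert: true, lean: true },   // new: false returns the document as it was before the update
    function (err, previous) {
        if (err) return console.error(err);
        // previous is null when the upsert inserted a brand new document
        var changed = !previous || Object.keys(newDoc).some(function (key) {
            // naive deep comparison; good enough for plain JSON fields
            return JSON.stringify(previous[key]) !== JSON.stringify(newDoc[key]);
        });
        console.log('document actually changed:', changed);
    }
);
This keeps it to a single round trip, at the cost of a field-by-field comparison on the client side.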

Preventing concurrent access to documents in Mongoose

My server application (using Node.js, MongoDB, and Mongoose) has a collection of documents for which it is important that two client applications cannot modify them at the same time without seeing each other's modifications.
To prevent this I added a simple document versioning system: a pre-hook on the schema which checks if the version of the document is valid (i.e., not higher than the one the client last read). At first sight it works fine:
// Validate version number
UserSchema.pre("save", function (next) {
    var user = this
    // userCurrent is the user document that is currently in the db
    user.constructor.findById(user._id, function (err, userCurrent) {
        if (err) return next(err)
        if (userCurrent == null) return next()
        if (userCurrent.docVersion > user.docVersion) {
            return next(new Error("document was modified by someone else"))
        } else {
            user.docVersion = user.docVersion + 1
            return next()
        }
    })
})
The problem is the following:
When one User document is saved at the same time by two client applications, is it possible that these operations interleave between the pre-hook and the actual save? What I mean is the following; imagine time going from left to right and v being the version number (which is persisted by save):
App1: findById(pre)[v:1] save[v->2]
App2: findById(pre)[v:1] save[v->2]
This results in App1 saving something that was modified in the meantime (by App2), with no way to notice the modification, and App2's update is completely lost.
My question might boil down to: Do the Mongoose pre-hook and the save method happen in one atomic step?
If not, could you give me a suggestion on how to fix this problem so that no update ever gets lost?
Thank you!
MongoDB has findAndModify which, for a single matching document, is an atomic operation.
Mongoose has various methods that use findAndModify under the hood, and I think they will suit your use case:
Model.findOneAndUpdate()
Model.findByIdAndUpdate()
Model.findOneAndRemove()
Model.findByIdAndRemove()
Another solution (one that Mongoose itself uses as well for its own document versioning) is to use the Update Document if Current pattern.
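A minimal sketch of that pattern applied to the schema above (assuming a User model compiled from UserSchema; userId, expectedVersion, newName, and callback are illustrative names): the version the client last read goes into the query, so a concurrent writer who already bumped docVersion makes this update match nothing.
User.findOneAndUpdate(
    { _id: userId, docVersion: expectedVersion },          // only match the version this client last read
    { $set: { name: newName }, $inc: { docVersion: 1 } },  // apply the change and bump the version atomically
    { new: true },
    function (err, doc) {
        if (err) return callback(err)
        if (doc == null) {
            // no match: someone else saved a newer version in the meantime
            return callback(new Error("document was modified by someone else"))
        }
        return callback(null, doc)
    }
)
Because the query and the update run as one atomic findAndModify on the server, there is no window between the version check and the write.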

How to ensure two users can atomically confirm transaction has taken place in mongodb

I have a model called Transaction with the following schema:
var transactionSchema = new mongoose.Schema({
    amount: Number,
    status: String,
    _recipient: { type: mongoose.Schema.Types.ObjectId, ref: 'User' },
    _sender: { type: mongoose.Schema.Types.ObjectId, ref: 'User' },
});
I want both the sender and the recipient of this transaction to be able to 'confirm' that the transaction took place. The status starts out as "initial". When only the sender has confirmed the transaction (but not yet the recipient), I want to update the status to "senderConfirmed" or something similar, and when the recipient has confirmed it (but the sender has not), I want to update the status to "recipientConfirmed". When both have confirmed it, I want to update the status to "complete".
The problem is: how can I know when to update it to "complete" in a way that avoids race conditions? If both the sender and the recipient go to confirm the transaction at the same time, both threads will think the status is "initial" and update it only to "senderConfirmed" or "recipientConfirmed", when in actuality it ought to go to "complete".
I read about MongoDB's two-phase commit approach here, but it doesn't quite fit my need, since I don't want (in the case that another thread is currently modifying a transaction) to prevent the second thread from making its update; I just want it to wait until the first thread is finished before doing its update, and then make the content of its update contingent on the latest status of the transaction.
The bottom line is that you need "two" update statements to do this, for each of the sender and the recipient respectively. So basically one is going to try to set the "partial" status to "complete", and the other will only move a document matching the "initial" status to the "partial" state.
Bulk operations are the best way to implement multiple statements, so you should use these by accessing the underlying driver methods. Modern API releases have the .bulkWrite() method, which degrades nicely if the server version does not support the "bulk" protocol, and just falls back to issuing separate updates.
// sender confirmation
Transaction.collection.bulkWrite(
    [
        { "updateOne": {
            "filter": {
                "_id": docId,
                "_sender": senderId,
                "status": "recipientConfirmed"
            },
            "update": {
                "$set": { "status": "complete" }
            }
        }},
        { "updateOne": {
            "filter": {
                "_id": docId,
                "_sender": senderId,
                "status": "initial"
            },
            "update": {
                "$set": { "status": "senderConfirmed" }
            }
        }}
    ],
    { "ordered": false },
    function (err, result) {
        // result will confirm that at most 1 update succeeded
    }
);
And of course the same applies for the _recipient, except with a different status check and change. You could alternatively issue an $or condition on the _sender or _recipient and use a generic "partial" status instead of coding different update conditions, but the same basic "two update" process applies.
Of course, again, you "could" just use the regular methods and issue both updates to the server in another way, possibly even in parallel since the conditions remain "atomic", but that is also the reason for the { "ordered": false } option, since there is no determined sequence that needs to be respected here.
Bulk operations though are better than separate calls, since the send and return is only one request and response, as opposed to "two" of each, so the overhead using bulk operations is far less.
But that is the general approach. No single statement could possibly leave a "status" in "deadlock" or mark as "complete" before the other party also issues their confirmation.
There is a "possibility" and a very slim one that a status was changed from "initial" in between the first attempt update and the second, which would result in nothing being updated. In that case, you can "retry" the action on which it "should" update on the subsequent attempt.
This should only ever need "one" retry at most though. And very very rarely.
NOTE: Care should be taken when using the .collection accessor on Mongoose models. All the regular model methods have built in logic to "ensure" the connection to the database is actually present before they do anything, and in fact "queue" operations until a connection is present.
It's generally good practice to wrap your application startup in an event handler to ensure the database connection:
mongoose.on("open",function() {
// App startup and init here
})
So using the "on" or "once" events for this case.
Generally though a connection is always present either after this event is fired, or after any "regular" model method has already been called in the application.
Possibly mongoose will include methods like .bulkWrite() directly on the model methods in future releases. But presently it does not, so the .collection accessor is necessary to grab the underlying Collection object from the core driver.
Update: I am clarifying my answer based on a comment that my original response did not provide an answer.
An alternative approach would be to keep track of the status as two separate properties:
senderConfirmed: true/false,
recipientConfirmed: true/false,
When the sender confirms you simply update the senderConfirmed field. When the recipient confirms you update the recipientConfirmed field. There is no way they will overwrite each other.
To determine if the transaction is complete you would merely query {senderConfirmed:true,recipientConfirmed:true}.
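A rough sketch of the sender's side under that schema (docId and senderId are illustrative; the recipient's side is the mirror image):
Transaction.update(
    { _id: docId, _sender: senderId },
    { $set: { senderConfirmed: true } },
    function (err) {
        if (err) return console.error(err);
        // completeness is just a query over both flags
        Transaction.findOne(
            { _id: docId, senderConfirmed: true, recipientConfirmed: true },
            function (err, doc) {
                if (err) return console.error(err);
                if (doc) console.log('transaction is complete');
            }
        );
    }
);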
Obviously this is a change to the document schema, so it may not be ideal.
Original Answer:
Is a change to your schema possible? What if you had two properties, senderStatus and recipientStatus? The sender would only update senderStatus and the recipient would only update recipientStatus, so they couldn't overwrite each other's changes.
You would still need some other way to mark it as complete, I assume. You could use a cron job or something...

Updating and Deleting documents using NodeJS on Cloudant DB

I was able to successfully query (select) and insert data into a Cloudant database using the HTTP/REST API, but I am not able to figure out how to delete and modify documents.
For Delete: I tried the following in Node.js
path : '/quick_loan_nosql_db_1?951b05d1b6aa100f4b94e5185674ef40/_rev=1-88963c755157188091f0d30115de18ce'
as part of the REST API request with method DELETE.
But when I execute it, it deletes the entire database instead of the document with the specified ID.
For Update: can someone provide a sample? I tried with PUT, but in the response I got a conflict error.
Any input would be appreciated.
Nice! To answer your original question, you just have the "/" and the "?" in the wrong places. To recap:
/quick_loan_nosql_db_1?951b05d1b6aa100f4b94e5185674ef40/_rev=1-88963c755157188091f0d30115de18ce
should instead be:
/quick_loan_nosql_db_1/951b05d1b6aa100f4b94e5185674ef40?_rev=1-88963c755157188091f0d30115de18ce
Here is one way I figured out to perform the Update and Delete.
I used the nano API. Include nano with:
var nano = require('nano')('https://' + dbCredentials.user + ':' + dbCredentials.password + '@hostname:port/');
Please make sure to use the right user id, password, host, and port.
For Update
Update - you need to use the insert API only, but with the right _id and _rev and the changes. For example:
nanodb.insert({
    "_id": "3a1cc8c7f955f895131c3289f5144eab",
    "_rev": "2-7040205a1c9270ad696724458399ac94",
    "name": "Tom",
    "employername": "Google"
}, function (err, body, header) {
    if (err) {
        console.log('[db.insert] ', err);
        return;
    }
    console.log('you have updated the document.')
    console.log(body);
});
The above code performs an update for the given _id and _rev. A new revision number is generated and the _id remains the same. If you omit the _id or provide a stale revision number, it will throw a conflict error.
For Delete
Simply use nanodb.destroy with the id and the revision number:
nanodb.destroy("3a1cc8c7f955f895131c3289f5144eab", "3-3e39e2298f109414cef1310449e0fd5c", function (err, body, header) {
    if (err) {
        console.log('[db.destroy] ', err);
        return;
    }
    console.log('you have deleted the document.')
    console.log(body);
});
Using a library API like nano is better than making raw REST API calls over HTTP to access the Cloudant database.
Hope this helps people who want to connect to a Cloudant DB from Node.js.
