MongoDB Database Semaphores and Node.js process.nextTick()

This may be a very bad idea, or a possible solution to a database concurrency problem we have.
We have a method that is called to update a Mongo record. We are seeing some concurrency problems: process A reads the record, process B reads the record, process A makes its mods and saves the record, then process B makes its mods and saves the record. Because B reads after A reads but before A writes, it doesn't know about the changes A made, and we lose the data from A.
I'm wondering if we could use a database semaphore, basically a boolean field on the collection. If we read the record at the start of the method and the field is true, it's being edited; at that point, re-call the method with the same data using process.nextTick(). Otherwise, set the semaphore and carry on.
There would still be a bit of time between the read and the save, but it should be, or could be, shorter than what we have now.
It would be something like this. Any thoughts? Has anyone done anything like this? Will it even work?
function remove_source(service_id, session, next)
{
    var User = Mongoose.model("User");
    /* get the user, based on the session user id */
    User.findById(session.me, function(err, user_info)
    {
        if (user_info.semaphore === true)
        {
            /* locked: retry on the next tick (pass a function, don't invoke it) */
            process.nextTick(function()
            {
                remove_source(service_id, session, next);
            });
        }
        else
        {
            user_info.semaphore = true;
            user_info.save(function(err, user_new)
            {
                if (err) next(err, user_new);
                else continue_on(user_new);
            });
        }
    });

    function continue_on(user_new)
    {
        // etc.......
    }
}
Edit: New Code:
The function now looks as follows. I'm doing individual updates to the arrays. This of course means that I now have the possibility, if the transaction fails between the first and second transactions, of having data out of sync. I'm thinking that I could simply resave the user object that I retrieved on entry into the function, overwriting my changes. I don't know if Mongoose/Mongo will not do the save if I have not changed that object, will have to try and see. Any more thoughts?
var User = Mongoose.model("User");
/* get the user, based on the session user id */
User.findById(session.me, function(err, user_info)
{
    if (err)
    {
        next(err, user_info, null);
        return;
    }
    if (!user_info)
    {
        next(_e("ACCOUNT_NOT_FOUND"), "user_id: " + session.me);
        return;
    }
    /* _.where() returns an array, so take the first match */
    var source_service_info = _.where(user_info.credentials, {"source_service_id": service_id})[0];
    var source_service = source_service_info.source_service;
    User.findByIdAndUpdate(session.me, {$pull: {"credentials": {"source_service_id": service_id}}}, {}, function(err, user_credential_removed)
    {
        if (err)
        {
            next(err, user_info, null);
            return;
        }
        User.findByIdAndUpdate(session.me, {$pull: {"criteria": {"source_service": source_service}}}, {}, function(err, user_criteria_removed)
        {
            if (err)
            {
                next(err, user_info, null);
            }
            else
            {
                next(null, user_criteria_removed);
            }
        });
    });
});
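Regarding the out-of-sync risk mentioned above: since both pulls target the same user document, they could also be combined into one atomic update, which removes the failure point between them. An untested sketch based on the code above ($pull accepts multiple array fields in a single update document):

User.findByIdAndUpdate(session.me,
{
    $pull:
    {
        "credentials": {"source_service_id": service_id},
        "criteria": {"source_service": source_service}
    }
}, {}, function(err, user_updated)
{
    if (err) next(err, user_info, null);
    else next(null, user_updated);
});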

The problem with your approach is that it only shortens the window during which the data could be read by a second process; it doesn't eliminate the problem.
The solution is to set your semaphore in the same action as the read. I haven't used Mongoose, but in MongoDB you can use findAndModify to return a User record only if the semaphore is false and, in the same atomic operation, set the semaphore to true.
If you don't want to use findAndModify, you could first do an update that sets the semaphore to true (or to some specific ID value so you know that it is YOUR semaphore) only if the semaphore is not set. Then, if that update succeeds, you could do the find (perhaps passing your semaphore ID as a criterion in the find). However, findAndModify, if it is available in Mongoose, would do that in one step.
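In Mongoose, the findAndModify-style primitive is exposed as findOneAndUpdate, so the atomic check-and-set could look roughly like this sketch against the question's schema (untested):

User.findOneAndUpdate(
    { _id: session.me, semaphore: { $ne: true } }, // match only if not locked
    { $set: { semaphore: true } },
    function(err, user_info)
    {
        if (err) return next(err);
        if (!user_info)
        {
            // someone else holds the semaphore; try again later
            return process.nextTick(function() { remove_source(service_id, session, next); });
        }
        // we hold the semaphore: modify, save, then clear it
    });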
A variation of that is described here: http://docs.mongodb.org/manual/tutorial/isolate-sequence-of-operations/ where you do a form of optimistic locking that checks that the old values are unchanged before changing them to the new values.
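Transplanted to this question, the pattern from that tutorial is an update whose filter re-checks the value you originally read. A minimal sketch, with illustrative field names:

// only applies the change if nobody modified the field since we read oldValue
User.update(
    { _id: session.me, some_field: oldValue },
    { $set: { some_field: newValue } },
    function(err, result)
    {
        // if nothing matched, the document changed underneath us: re-read and retry
    });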
There is a variation on this that uses a separate table to simulate a two-phase commit: http://docs.mongodb.org/manual/tutorial/perform-two-phase-commits/

Edited: Based on the interchange below, this seems to be a schema and updating issue. The question may become something like: I have some entries in an array, and the ordinal index of those entries relates to some other arrays as well. How do I perform deletes without ending up with mismatches?
Three off-the-top possibilities occur, depending on frequency in the real world vs. QA test scenarios.
Consider adding a deleted flag but keeping the records in the same order. If someone toggles, reuse the same record, fixing it up however you want.
Use an associative array (a JS object) for each element (not a feature of the relational world). If you need an order, add an array that lists the keys in order. Both approaches have syntax to update without touching anything other than what has changed, and will not overwrite changes to different fields.
Use an associative array where the keys are numbers. Actual deletion won't hurt retrieval:
stuff = {}
stuff[1] = {some:'details'}
stuff[2] = {some:'details2'}
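For either associative-array option above, an update can name exactly one nested field, so concurrent writers touching different elements won't clobber each other. An illustrative sketch (collection and field names are made up):

// rewrite one field of one element, leaving all the other keys untouched
db.houses.update(
    { _id: id },
    { $set: { "stuff.2.some": "new details2" } }
);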
The original answer was:
1) Are you making changes to the same field? Make it into an array, push changes, and pop the latest to read the current value.
2) Are you changing different fields, but data is getting trounced? Then there is better syntax to use for the updating; you can update field by field:
$set: { 'fielda': 'valuea' }
won't lose edits to other fields.
3) Change your schema.
4) Change the timing on the processes so they don't overlap, or so they overlap in smaller subsets that you can manage to prevent from overlapping.
I'd like to know, just out of interest, why multiple processes need to make updates on the same record. I don't work with anything that looks like that.

Related

How to lock table with pg-promise

I have
db.result('DELETE FROM categories WHERE id = ${id}', category).then(function (data) { ...
and
db.many('SELECT * FROM categories').then(function (data) { ...
Initially delete is called from one API call and then select from a following API call, but the callbacks for the db requests happen in reverse order, so I get the list of categories with the removed category still included.
Is there a way to lock the categories table with pg-promise?
If you want the result of the SELECT to always reflect the result of the previous DELETE, then you have two approaches to consider...
The standard approach is to unify the operations into one, so you end up executing all your dependent queries against the same connection:
db.task(function * (t) {
    yield t.none('DELETE FROM categories WHERE id = ${id}', category);
    return yield t.any('SELECT * FROM categories');
})
    .then(data => {
        // data = only the categories that weren't deleted
    });
You can, of course, also use either the standard promise syntax or ES7 async/await.
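For reference, a sketch of the same task using async/await, which pg-promise also accepts for task callbacks:

db.task(async t => {
    await t.none('DELETE FROM categories WHERE id = ${id}', category);
    return t.any('SELECT * FROM categories');
})
    .then(data => {
        // data = only the categories that weren't deleted
    });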
The second approach would be to organize an artificial lock inside your service that holds off on executing any corresponding SELECT until the DELETE requests are all done.
However, this is a very awkward solution, typically pointing at a flaw in the architecture. Also, as the author of pg-promise, I won't even get into that solution, as it would be well outside of my library anyway.

How to read/write a document in parallel execution with mongoDB/mongoose

I'm using MongoDB with Node.js, via Mongoose.
I'm developing a multiplayer real-time game, so I receive many requests from many players, sometimes at the very same time.
I can simplify it by saying that I have a house collection that looks like this:
{
"_id" : 1,
"items": [item1, item2, item3]
}
I have a static function, called after each request is received:
house.statics.addItem = function(id, item, callback){
    var HouseModel = this;
    HouseModel.findById(id, function(err, house){
        if (err) throw err;
        // make some calculations such as:
        if (house.items.length < 4){
            HouseModel.findByIdAndUpdate(id, {$push: {items: item}}, callback);
        }
    });
}
In this example, I coded it so that the house document can never have more than 4 items. But what happens is that when I receive several requests at the very same time, this function is executed by both requests, and since it is asynchronous, they both push a new item to the items field and then my house has 5 items.
Am I doing something wrong? How can I avoid that behavior in the future?
Yes, you need better locking on the houseModel, to indicate that an addItem is in progress.
The problem is that multiple requests can call findById and see the same house.items.length, then each determine based on that (outdated) snapshot that it is OK to add one more item. The Node.js boundary of atomicity is the callback; between an async call and its callback, other requests can run.
One easy fix is to track not just the number of items in the house but the number of intended addItems as well. On entry into addItem, bump the "want to add more" count, and test that.
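A sketch of that idea (untested; the reserved field is hypothetical and counts committed items plus in-flight adds):

house.statics.addItem = function(id, item, callback){
    var HouseModel = this;
    // atomically claim a slot: matches only while fewer than 4 slots are taken
    HouseModel.findOneAndUpdate(
        { _id: id, reserved: { $lt: 4 } },
        { $inc: { reserved: 1 } },
        function(err, house){
            if (err || !house) return callback(err || new Error('house is full'));
            // the slot is ours; the push can no longer exceed the cap
            HouseModel.findByIdAndUpdate(id, {$push: {items: item}}, callback);
        });
}

Because the $inc only happens when the filter matches, the reservation and the cap check are a single atomic step.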
One possible approach since the release of Mongoose 4.10.8 is writing a plugin which makes save() fail if the document has been modified since you loaded it. A partial example is referenced in #4004:
@vkarpov15 said:
8b4870c should give you the general direction of how one would write a plugin for this
Since Mongoose 4.10.8, plugins now have access to this.$where. For documents which have been loaded from the database (i.e., are not this.isNew), the plugin can add conditions which will be evaluated by MongoDB during the update which can prevent the update from actually happening. Also, if a schema’s saveErrorIfNotFound option is enabled, the save() will return an error instead of succeeding if the document failed to save.
By writing such a plugin and changing some property (such as a version number) on every update to the document, you can implement “optimistic concurrency” (as #4004 is titled). I.e., you can write code that roughly does findOne(), do some modification logic, save(), if (ex) retry(). If all you care about is a document remaining self-consistent and ensuring that Mongoose’s validators run and your document is not highly contentious, this lets you write code that is simple (no need to use something which bypasses Mongoose’s validators like .update()) without sacrificing safety (i.e., you can reject save()s if the document was modified in the meantime and avoid overwriting committed changes).
Sorry, I do not have a code example yet nor do I know if there is a package on npm which implements this pattern as a plugin yet.
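For illustration only, an untested sketch of what such a plugin might look like (the _version field name is made up for this example):

// adds a version number and refuses the save if it changed since the load
module.exports = function optimisticConcurrency(schema) {
    schema.add({ _version: { type: Number, default: 0 } });
    schema.pre('save', function (next) {
        if (!this.isNew) {
            // condition evaluated by MongoDB during the update (Mongoose >= 4.10.8)
            this.$where = { _version: this._version };
            this._version++;
        }
        next();
    });
};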
I am also building a multiplayer game and ran into the same issue. I believe I have solved it by implementing a queue-like structure:
class NpcSaveQueue {
    constructor() {
        this.queue = new Map();
        this.runQueue();
    }
    addToQueue(unitId, obj) {
        // normalize the key once, so has/get/delete all use the same type
        unitId = String(unitId);
        if (!this.queue.has(unitId)) {
            this.queue.set(unitId, obj);
        } else {
            // merge with any updates already queued for this unit
            this.queue.set(unitId, {
                ...this.queue.get(unitId),
                ...obj,
            });
        }
    }
    emptyUnitQueue(unitId) {
        this.queue.delete(String(unitId));
    }
    async executeUnitQueue(unitId) {
        // note: updates queued for this unit while the save is in flight
        // will be dropped by emptyUnitQueue below
        await NPC.findByIdAndUpdate(unitId, this.queue.get(unitId));
        this.emptyUnitQueue(unitId);
    }
    runQueue() {
        setInterval(() => {
            this.queue.forEach((value, key) => {
                this.executeUnitQueue(key);
            });
        }, 1000);
    }
}
Then when I want to update an NPC, instead of interacting with Mongoose directly, I run:
npcSaveQueue.addToQueue(unit._id, {
"location.x": newLocation.x,
"location.y": newLocation.y,
});
That way, every second, the SaveQueue just executes all code for every NPC that requires updating.
This function never executes twice, because the update operation is atomic at the level of a single document.
More info in the official manual: http://docs.mongodb.org/manual/core/write-operations-atomicity/#atomicity-and-transactions
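Applied to the house example above, that atomicity can be exploited by folding the length check into the update's filter, so the read and the write can no longer interleave. A sketch ('items.3': {$exists: false} is one way to say "fewer than 4 items"):

house.statics.addItem = function(id, item, callback){
    this.findOneAndUpdate(
        { _id: id, 'items.3': { $exists: false } }, // matches only while items.length < 4
        { $push: { items: item } },
        callback);
}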

How to loop over object & return mongoDB entry for each item?

I am having difficulties looping over an object of constituency data, finding existing entries in a MongoDB database and doing something with them. It always ends up being the same entry that gets passed to the DB lookup over and over again.
I am assuming this is a problem of scope and timing.
My code:
for (key in jsonObj) {
    var newConstituent = new Constituent({
        name : jsonObj[key]["Name"],
        email : jsonObj[key]["Email"],
        social : {
            twitter: {
                twitter_handle : jsonObj[key]["Twitter handle"],
                twitter_id : jsonObj[key]["User id"],
                timestamp : jsonObj[key]["Timestamp"]
            }
        }
    });
    console.log(jsonObj[key]["Email"]); // this is fine here!
    Constituent.findOne({ email : jsonObj[key]["Email"] }, function(err, constitutents){
        console.log(jsonObj[key]["Email"]); // here it's always the same record
        if (err) {
            console.log(err)
        }
        if (constitutents === 'null') {
            console.log("Constituent not found. Create new entry .. ");
            // console.log(newConstituent);
            newConstituent.save(function (err) {
                if (err) {
                    console.log('db save error');
                }
            });
        } else {
            console.log("Constituent already exists .. ");
        }
    });
}
I have a suspicion that the for loop finishes sooner than .findOne() executes, and therefore the find always and only gets the last item of the object.
Could someone point me into the right direction?
A couple of things.
Don't use for ... in, especially in Node. You can use Object.keys() and any of the array methods at that point. for ... in can include values you don't wish to loop over unless you're using hasOwnProperty, since it'll include values from the prototype chain.
The reason the email is the same is that you're just printing out your query again. jsonObj is included in the scope of your callback to findOne since you're not re-declaring it inside the findOne callback. So whatever the value of key happens to be when the callback is invoked (my guess is that it's the last one in your list) is the email you're getting. Since, in JavaScript, inner function scope always implicitly includes the scope of the surrounding context, you're just accessing the jsonObj from your enclosing scope.
To clarify this point: your for ... in loop is synchronous -- that is, the interpreter finishes running all the instructions in it before it processes any new instructions. findOne, however, is asynchronous. Very simply, when you call it in this loop, it's not actually doing ANYTHING immediately -- the interpreter is still running your for ... in loop. It is, however, adding more tasks to the execution stack to run after it's finished your loop. So the loop finishes, AND THEN your callbacks will start to execute. Since the for ... in loop is totally finished, key is set to whatever its final value was. So, for example, if its last value was foo, that means EVERY TIME your callback is invoked, you will be printing out jsonObj.foo, since the for ... in loop is already complete.
It's like you asked your friend to say the letters from A to J while you left the room to do 10 things. He totally finished going to J, since that is much faster than doing even 1 of the 10 things you're doing. Now every time you're done doing one of your things, you come back and ask, "What's the latest letter you said?" The answer will ALWAYS be J. If you need to know what letter he was on for each of your tasks, you either need to get him to stop counting while you're doing them, or somehow attach the current letter to each task as it starts.
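A minimal demonstration of the same effect, with setTimeout standing in for findOne:

for (var key in { a: 1, b: 2, c: 3 }) {
    setTimeout(function () { console.log(key); }, 0);
}
// prints "c" three times: the loop finished before any callback ran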
Having him wait is not a good idea -- it's a waste of his time. However, if you wrap your findOne in a new function where you pass in the value of key, this will work. See the updated code below.
I'm not sure about your data, but findOne will return one record. You're putting it into a variable with a plural name (constitutents). From reading your code I would expect back a single value here. (It might still be wrapped in an array, however.)
Since you're calling findOne and assigning the result of the find operation to constitutents, you should be examining that object in the console.log.
e.g.
console.log(constitutents.email); // or console.log(constitutents[0].email)
rather than
console.log(jsonObj[key]["Email"]);
(Assuming email is a property on constitutents.)
You might just try logging constitutents entirely to verify what you're looking for.
The reason the following code will work is that you're passing the current value of key to the function for each invocation. This means there is a local copy of that variable created each time you call findConstituent, rather than using the closure value of the variable.
function findConstituent(key, newConstituent){
    Constituent.findOne({ email : jsonObj[key]["Email"] }, function(err, constitutents){
        console.log(jsonObj[key]["Email"]); // now logs the right email for each callback
        if (err) {
            console.log(err);
        }
        if (constitutents === null) { // findOne returns null, not the string 'null'
            console.log("Constituent not found. Create new entry .. ");
            newConstituent.save(function (err) {
                if (err) {
                    console.log('db save error');
                }
            });
        } else {
            console.log("Constituent already exists .. ");
        }
    });
}

for (key in jsonObj) {
    var newConstituent = new Constituent({
        name : jsonObj[key]["Name"],
        email : jsonObj[key]["Email"],
        social : {
            twitter: {
                twitter_handle : jsonObj[key]["Twitter handle"],
                twitter_id : jsonObj[key]["User id"],
                timestamp : jsonObj[key]["Timestamp"]
            }
        }
    });
    // pass newConstituent in as well, so each invocation gets its own copy
    findConstituent(key, newConstituent);
}

Document postopen event not operating on profile document

I need to save a serial number for the document in a profile document. Here is the code of the Execute Script action:
if (document1.isNewNote()){
    var pdoc:NotesDocument = database.getProfileDocument("LastNumber","");
    var lnm = pdoc.getItemValue("lastNumber")[0];
    var inputText6:com.ibm.xsp.component.xp.XspInputText = getComponent("inputText6");
    inputText6.setValue(lnm);
    pdoc.replaceItemValue("lastNumber", lnm);
    pdoc.save();
}
This code is not opening the profile document at all. Is there anything wrong with the code?
"LastNumber" is the name of the form used to create Profile Document ?
this profile document already exist ?
there are no reader fields in this profile document ?
you have an error on this line : var pdoc:NotesDocument=database.getProfileDocument("LastNumber","") ?
or you have debug it and see that pdoc is null ?
instead of pdoc.getItemValue("lastNumber")[0] you can use pdoc.getItemValueInteger("lastNumber") to get a typed result
I supposed that this field contains a number and you want to increment it
instead of using inputText field you can set value directly with document1.setValue("NumberField", lnm);
I second the caution Per is suggesting. Profile documents can be a beast. You should abstract access to the "next number" into an SSJS function call. By the way, in your code snippet you don't actually increment the last number. Also: if your input text control is bound, go after the data source, not the UI.
A crude way to build a better function could be this (I would use a managed application bean for better isolation):
if (document1.isNewNote()) {
    document1.setValue("DocumentNumber", applicationTools.getNextNumber());
}
Then in a SSJS library you would have:
var applicationTools = {
    "getNextNumber" : function() {
        synchronized(applicationScope){
            var pdoc:NotesDocument = database.getProfileDocument("LastNumber","");
            if (!applicationScope.lastNumber) {
                applicationScope.lastNumber = pdoc.getItemValueInteger("lastNumber");
            }
            applicationScope.lastNumber++;
            pdoc.replaceItemValue("lastNumber", applicationScope.lastNumber);
            pdoc.save(); // Make sure pdoc is writeable by ALL!!!!
            pdoc.recycle();
            return applicationScope.lastNumber;
        }
    },
    "someOtherUtility" : function(nameToLookup, departments) {
        // more stuff here
    }
}
This, in some way, has been asked before, but not for a profile field. Someone could still simply go after the applicationScope.lastNumber variable, which is one of the reasons why I would rather use a bean. The other: you could do the saving asynchronously, so it would be faster.
Note: in any case, the number generation only works when you have a non-replicating database. But abstracting the function opens the possibility of replacing the profile-document fetch with a call to a central number generator ... or any other mechanism ... without changing your form again.

Best way to deal with document locking in xPages?

What is the best way to deal with document locking in XPages? Currently we use the standard soft locking, and it seems to work fairly well in the Notes client.
In XPages I considered using the "Allow Document Locking" feature, but I am worried that people would close the browser without using a close or save button, and then the lock would never be cleared.
Is there a way to clear the locks when the user has closed his session? I am seeing no such event.
Or is there an easier way to handle document locking?
I realize I can clear the locks using an agent, but when should it run? I would think sometime at night, when I can be fairly certain the locks are no longer really active.
Here is code I'm using:
/* DOCUMENT LOCKING */
/*
use the global object "documentLocking" with:
    .lock(doc)     -> locks a document
    .unlock(doc)   -> unlocks a document
    .isLocked(doc) -> returns true/false
    .lockedBy(doc) -> returns name of lock holder
    .lockedDT(doc) -> returns datetime stamp of lock
*/
function ynDocumentLocking() {
    /*
    a lock is an entry in the application scope
    with key = "$ynlock_"+UNID
    containing an array with
        (0) = username of lock holder
        (1) = timestamp of lock
    */
    var lockMaxAge = 60 * 120; // in seconds, default 120 min

    this.getUNID = function(v) {
        if (!v) return null;
        if (typeof v == "NotesXspDocument") return v.getDocument().getUniversalID();
        if (typeof v == "string") return v;
        return v.getUniversalID();
    }

    /* puts a lock into application scope */
    this.lock = function(doc:NotesDocument) {
        var a = new Array(2);
        a[0] = @UserName();
        a[1] = @Now();
        applicationScope.put("$ynlock_"+this.getUNID(doc), a);
        // print("SET LOCK "+"$ynlock_"+doc.getUniversalID()+" / "+a[0]+" / "+a[1]);
    }

    /* removes a lock from the application scope */
    this.unlock = function(doc:NotesDocument) {
        applicationScope.put("$ynlock_"+this.getUNID(doc), null);
        //print("REMOVED LOCK for "+"$ynlock_"+doc.getUniversalID());
    }

    this.isLocked = function(doc:NotesDocument) {
        try {
            //print("ISLOCKED for "+"$ynlock_"+doc.getUniversalID());
            var v = applicationScope.get("$ynlock_"+this.getUNID(doc));
            if (!v) {
                //print("no lock found -> return false");
                return false;
            }
            // if lock holder is the current user, treat as not locked
            if (v[0] == @UserName()) {
                //print("lock holder = user -> not locked");
                return false;
            }
            // check how old the lock is; timeDifference() is in seconds
            var dLock:NotesDateTime = session.createDateTime(v[1]);
            var dNow:NotesDateTime = session.createDateTime(@Now());
            //print("time diff="+dNow.timeDifference(dLock)+" dLock="+v[1]+" now="+@Now());
            // if diff > lockMaxAge, the lock has expired -> it's not locked
            if (dNow.timeDifference(dLock) > lockMaxAge) {
                // print("LOCK is older than maxAge "+lockMaxAge+" -> returning false");
                return false;
            }
            //print("return true");
            return true;
        } catch (e) {
            print("ynDocumentLocking.isLocked: "+e);
        }
    }

    this.lockedBy = function(doc:NotesDocument) {
        try {
            var v = applicationScope.get("$ynlock_"+this.getUNID(doc));
            if (!v) return "";
            //print("ISLOCKEDBY "+"$ynlock_"+doc.getUniversalID()+" = "+v[0]);
            return v[0];
        } catch (e) {
            print("ynDocumentLocking.lockedBy: "+e);
        }
    }

    this.lockedDT = function(doc:NotesDocument) {
        try {
            var v = applicationScope.get("$ynlock_"+this.getUNID(doc));
            if (!v) return "";
            return v[1];
        } catch (e) {
            print("ynDocumentLocking.lockedDT: "+e);
        }
    }
}
var documentLocking = new ynDocumentLocking();
You could take a page from the way webDAV works. There, a servlet manages a "lock list" of locked documents. The locks automatically expire after 10 minutes. Locks can be renewed or terminated through calls. So when you edit a document, you would request a lock, then kick off a CSJS timer that calls the relocking function every 8 minutes (so you have some margin for error), and the postSave event calls the unlock (unless you stay in edit mode).
If a user closes the browser, after 10 minutes the document is automatically unlocked. Since you are free to decide how to implement the locking function, you can capture user/location and use that information in the "lock failed" display (you could even push that further and let the original author know about it, or offer some "retry" option).
It isn't simple to implement, but once implemented it is simple to use.
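A rough sketch of the client side of that scheme (the renewLock.xsp endpoint and its parameter are hypothetical):

// re-stamp the lock every 8 minutes while the document is open in edit mode
var lockTimer = setInterval(function() {
    dojo.xhrPost({
        url: "renewLock.xsp",
        content: { unid: currentUnid },
        error: function() { clearInterval(lockTimer); } // lock lost or session gone
    });
}, 8 * 60 * 1000);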
ApplicationScope may be a good place to capture "locked" documents. After all, for applicationScope to expire, all users' sessions have to have expired, so anyone with the page open will not be able to save anyway.
Maybe capture UNID, user and time when someone edits a doc. Clear the value when the document is saved. Bear in mind that the user might close the browser etc. I've been discussing this approach internally and if we end up building this I would look to add it to OpenNTF. But we're unlikely to get onto it within the next month.
I prefer a solution similar to Mr. Withers' answer. The main issue is how to deal with the unwanted and dreaded back button. It is easy to lock a document when it is opened, but there are many ways to close the XPage, and the user is not limited to just the navigation you provide: he can also, as you stated, close the browser completely, use the back button, etc. So, the best way that I can think of is to create a few Java objects which we will use in the application and session scopes.
The first step is to create a "LockedDocument" class. As we know, documents are not serializable, and we do not want to save the document itself in this object; we want to save the UNID and the time it was locked. We save the time so that we can clear the object after a given period (like thirty minutes to an hour). This class should also implement the Comparable interface in order to sort the collection by this time, so that the oldest documents are first and the newest documents are last.
Next, we create another class that holds a list or a map of these LockedDocuments. This class must also have a thread (implement Runnable) that checks all documents every five minutes or so (I have not tested this yet, but it should work). Any document that was locked thirty to sixty minutes ago (predefined) is unlocked (deleted from the list). It is important that the list is sorted as described above and that the loop "breaks" when a time less than the lock time is reached, in order to prevent unwanted processing.
The next step would be to include the user-specific list in the sessionScope. This list holds the LockedDocuments that the current user has. It is set when the user changes the document's status to editable, and it is checked before the document is set to editable, to prevent one document from being opened in multiple tabs by the same user. The lock is checked once again in onquerysave(). Once a main page is opened, the lock is automatically released. The onquerysave() must also check that the document's UNID is in the sessionScope list, or that the document is new, before allowing a save.
Quick recap:
Any UNID saved in the applicationScope LockedDocumentList is not editable by anyone unless it also exists in their own sessionScope list.
It is possible to warn a user that their lock time is approaching and reset the timer.
The class containing the list of locked documents must be a singleton.
There are probably ways to improve this answer, and I am sure I am missing something; it is just a thought.
There might be a better way to handle this, but it is the best I have found.
You can remove the Domino lock in the window.onunload event:
window.onunload = function(){
dojo.xhrGet(...
}
No need to reinvent the wheel.
