MongoDB, incrementing value inside an array. (save() ? update() ?) - node.js

var Poll = mongoose.model('Poll', {
  title: String,
  votes: {
    type: Array,
    'default': []
  }
});
I have the above schema for my simple poll, and I am uncertain of the best method to change the value of the elements in my votes array.
app.put('/api/polls/:poll_id', function(req, res){
  Poll.findById(req.params.poll_id, function(err, poll){
    // I see the official website of MongoDB uses something like
    // db.collection.update()
    // but that doesn't apply here, right? I have direct access to the "poll" object here.
    // Can I do something like
    //   poll.votes[1] = poll.votes[1] + 1;
    //   poll.save() ?
    // Help much appreciated.
  });
});

You can do it the way you have above, but of course this involves "retrieving" the document from the server, then making the modification and saving it back.
If you have a lot of concurrent operations doing this, then your results are not going to be consistent, as there is a high potential for "overwriting" the work of another operation that is trying to modify the same content. So your increments can go out of "sync" here.
A better approach is to use the standard .update() family of operations. These make a single request to the server and modify the document, and can even return the modified document, as is the case with .findByIdAndUpdate():
Poll.findByIdAndUpdate(req.params.poll_id,
  { "$inc": { "votes.1": 1 } },
  { "new": true },   // return the modified document rather than the original
  function(err, doc) {
    // "doc" now holds the document with the incremented vote count
  }
);
So the $inc update operator does the work of modifying the array at the specified position using "dot notation". The operation is atomic, so no other operation can modify the document at the same time; if another increment was issued just before this one, the result correctly reflects both operations, and the correct data is returned in the result document.
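As a side note, the array position does not have to be hard-coded. If the index comes in with the request, the same pattern works with a computed dot-notation path. A minimal sketch, assuming a hypothetical :index route parameter on the Express route above:
app.put('/api/polls/:poll_id/votes/:index', function(req, res) {
  // Build the "votes.<n>" path dynamically from the request
  var update = { "$inc": {} };
  update["$inc"]["votes." + req.params.index] = 1;

  Poll.findByIdAndUpdate(req.params.poll_id, update, { "new": true },
    function(err, doc) {
      if (err) return res.status(500).send(err);
      res.json(doc); // the document with the incremented count
    }
  );
});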

Related

Watch MongoDB to return changes along with a specified field value instead of returning fullDocument

I'm using MongoDB's watch() function to listen to changes made to a replica set. I know I can get the whole document (fullDocument) by passing { fullDocument: 'updateLookup' } to the watch method, like:
someModel.watch({ fullDocument: 'updateLookup' })
But what I really want is to get just one extra field that isn't changed every time an update is made.
Let's say a field called 'user_id'. Currently I only get the updatedFields and the fullDocument, which contains the 'user_id' along with a lot of other data I would like to avoid.
What I have researched so far is the aggregation pipeline, but I couldn't figure out a way to implement it.
Can anybody help me figure out a way to do this?
Thanks everyone for the suggestions. As @D.SM pointed out, I successfully implemented $project, like this:
const filter = [
  { "$match": { "operationType": "update" } },
  // $project cannot mix inclusions and exclusions (except for _id), so just
  // include the fields you want; every other fullDocument field is dropped.
  { "$project": { "operationType": 1, "fullDocument.user_id": 1 } }
];
Then I passed it to the watch() method, like:
const userDBChange = userChatModel.watch(filter, { fullDocument: 'updateLookup' });
Now I'm only getting user_id inside the fullDocument object when the operationType is update, reducing the data overhead returned from MongoDB.
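For completeness, a minimal sketch of consuming the filtered stream (the event shape follows from the pipeline above):
userDBChange.on('change', (change) => {
  // Only update events arrive here, and fullDocument carries just user_id
  console.log(change.fullDocument.user_id);
});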
Thanks again @D.SM and others for trying to help me out ;)

Upsert and $inc Sub-document in Array

The following schema is intended to record total views and views for a very specific day only.
const usersSchema = new Schema({
  totalProductsViews: { type: Number, default: 0 },
  productsViewsStatistics: [{
    day: { type: String, default: new Date().toISOString().slice(0, 10), unique: true },
    count: { type: Number, default: 0 }
  }],
});
So today's views are stored in a different subdocument from yesterday's. To implement this I tried to use upsert, so that a subdocument is created each day a product is viewed, and counts are incremented and recorded for that particular day. I tried the following function, but it does not work the way I intended.
usersSchema.statics.increaseProductsViews = async function (id) {
  // Based on day only.
  const todayDate = new Date().toISOString().slice(0, 10);
  const result = await this.findByIdAndUpdate(id, {
    $inc: {
      totalProductsViews: 1,
      'productsViewsStatistics.$[sub].count': 1
    },
  },
  {
    upsert: true,
    arrayFilters: [{ 'sub.day': todayDate }],
    new: true
  });
  console.log(result);
  return result;
};
What am I missing to get the functionality I want? Any help will be appreciated.
What you are trying to do here actually requires you to understand some concepts you may not have grasped yet. The two primary ones being:
You cannot use any positional update as part of an upsert since it requires data to be present
Adding items into arrays mixed with "upsert" is generally a problem that you cannot do in a single statement.
It's a little unclear if "upsert" is your actual intention anyway, or if you just presumed that was what you had to add in order to get your statement to work. It does complicate things if that is your intent, even if it's unlikely given the findByIdAndUpdate() usage, which would imply you actually expected the "document" to always be present.
At any rate, it's clear you actually expect to "update the array element when found, OR insert a new array element when not found". This is actually a two-write process, and three when you consider the "upsert" case as well.
For this, you actually need to invoke the statements via bulkWrite():
usersSchema.statics.increaseProductsViews = async function (_id) {
  // Based on day only.
  const todayDate = new Date().toISOString().slice(0, 10);
  await this.bulkWrite([
    // Try to match an existing element and update it ( do NOT upsert )
    {
      "updateOne": {
        "filter": { _id, "productsViewsStatistics.day": todayDate },
        "update": {
          "$inc": {
            "totalProductsViews": 1,
            "productsViewsStatistics.$.count": 1
          }
        }
      }
    },
    // Try to $push where the element is not there but the document is ( do NOT upsert )
    {
      "updateOne": {
        "filter": { _id, "productsViewsStatistics.day": { "$ne": todayDate } },
        "update": {
          "$inc": { "totalProductsViews": 1 },
          "$push": { "productsViewsStatistics": { "day": todayDate, "count": 1 } }
        }
      }
    },
    // Finally attempt the upsert where the "document" was not there at all,
    // only if you actually mean it - so optional
    {
      "updateOne": {
        "filter": { _id },
        "update": {
          "$setOnInsert": {
            "totalProductsViews": 1,
            "productsViewsStatistics": [{ "day": todayDate, "count": 1 }]
          }
        },
        "upsert": true
      }
    }
  ]);

  // return the modified document if you really must
  return this.findById(_id); // Not atomic, but the lesser of all evils
}
There's a real good reason here why the positional filtered [<identifier>] operator does not apply. Its intended purpose is to update multiple matching array elements, and you only ever want to update one. For that there is the positional $ operator, which does exactly that. Its condition, however, must be included within the query predicate (the "filter" property in the updateOne statements), just as demonstrated in the first two statements of the bulkWrite() above.
So the main problem with the positional filtered [<identifier>] is that, just as the first two statements show, you cannot actually alternate between the $inc and the $push depending on whether the document actually contains an array entry for the day. At best, no update will be applied when the current day is not matched by the expression in arrayFilters.
At worst, an actual "upsert" will throw an error because MongoDB cannot decipher the "path name" from the statement, and of course you simply cannot $inc something that does not exist as a "new" array element. That needs a $push.
That leaves you with the mechanic that you also cannot do both the $inc and the $push within a single statement. MongoDB will error that you are attempting to "modify the same path", as an illegal operation. Much the same applies to $setOnInsert, since whilst that operator only applies to "upsert" operations, it does not preclude the other operations from happening.
Thus the logical steps fall back to what the comments in the code also describe:
Attempt to match where the document contains an existing array element, then update that element. Using $inc in this case.
Attempt to match where the document exists but the array element does not, then $push a new element for the given day with the default count, incrementing the total appropriately.
IF you actually did intend to upsert documents (not array elements, because that is what the above steps are for), then finally attempt an upsert creating the new properties, including a new array.
Finally there is the issue of the bulkWrite(). Whilst this is a single request to the server with a single response, it is still effectively three (or two, if that's all you need) operations. There is no way around that, but it is better than issuing chained separate requests using findByIdAndUpdate() or even updateOne().
Of course the main operational difference from the code you attempted is that this method does not return the modified document. There is no way to get a "document response" from any "Bulk" operation at all.
As such, the "bulk" process will only ever modify the document with one of the three statements submitted, based on the logic presented and, most importantly, the order of those statements, which matters. If you actually want to "return the document" after modification, the only way to do that is with a separate request to fetch it.
The only caveat is the small possibility that other modifications could have occurred to the document besides the "array upsert", since the read and the update are separated. There really is no way around that, short of "chaining" three separate requests to the server and then deciding which "response document" actually applied the update you wanted.
So with that context, it's generally considered the lesser of evils to do the read separately. It's not ideal, but it's the best option available from a bad bunch.
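What you do get back from bulkWrite() is a summary result, which at least tells you which branch applied. A small sketch, assuming the three updateOne statements above are held in a statements array:
const result = await this.bulkWrite(statements);
// matchedCount/modifiedCount reflect the first two statements;
// upsertedCount tells you whether the third statement inserted a new document.
console.log(result.matchedCount, result.modifiedCount, result.upsertedCount);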
As a final note, I would strongly suggest storing the day property as a BSON Date instead of a string. It actually takes fewer bytes to store and is far more useful in that form. As such, the following constructor is probably the clearest and least hacky:
const todayDate = new Date(new Date().setUTCHours(0, 0, 0, 0)); // today at UTC midnight
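Under that change (assuming productsViewsStatistics.day is declared as { type: Date } in the schema), the first bulkWrite statement would become the following, and the $push and $setOnInsert statements change the same way:
// todayDate is now the UTC-midnight Date built above
await this.bulkWrite([
  {
    "updateOne": {
      "filter": { _id, "productsViewsStatistics.day": todayDate },
      "update": {
        "$inc": {
          "totalProductsViews": 1,
          "productsViewsStatistics.$.count": 1
        }
      }
    }
  }
  // ...remaining statements as before, comparing and storing the same Date value
]);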

How to validate array length when using $push?

I'm trying to limit the number of elements a user can add to an array field on one of my schemas. I'm currently adding the elements to the array using Schema.findOneAndUpdate() with the $push operator.
The first thing I tried was the solution given by another answer here on StackOverflow, namely: https://stackoverflow.com/a/29418656/6502807
This solution adds a validate function to the fields in the schema definition. By setting runValidators to true, I did get the function to run with Schema.findOneAndUpdate(). It was at that moment, however, that I stumbled upon the next problem. At the end of the Validation chapter in the Mongoose docs it says:
Also, $push, $addToSet, $pull, and $pullAll validation does not run any validation on the array itself, only individual elements of the array.
So attempting to check the array length did not work when using $push. It simply supplied the validation function with an empty array every time, regardless of the array's actual contents in the database.
Next thing I tried was to use a pre hook. This was without any success as well. For some reason it did not execute the hook, even with runValidators set to true. This is how I defined said hook:
Settings.pre('update', async function (next) {
  if (this.messages.length > MAX_MESSAGES) {
    throw new Error('Too many messages');
  } else {
    next();
  }
});
EDIT: The reason the function did not fire was that I was using findOneAndUpdate instead of update. This is fixed and the function now runs. The solution code above, however, does not work.
The schema with the array looks like this:
const Settings = new mongoose.Schema({
  // A lot more fields not relevant to this question
  messages: {
    type: [{
      type: String
    }]
  }
});
Another thing worth mentioning is that these update statements are used in conjunction with other options. I need the update statement to behave as an upsert, so my complete set of options looks like this:
{
  runValidators: true,
  setDefaultsOnInsert: true,
  upsert: true,
  new: true
}
When executing queries with the pre hook set like this, the array limit can be exceeded without any validation error being thrown.
At this point I'm wondering if there is any sensible way to do a max length check like this without having to do it myself outside of mongoose's abstraction layer.
I am using Mongoose 5.2.6 running on node v9.11.1 with MongoDB 4.0.0.
Any help is much appreciated!
Well, if you are using recent versions of MongoDB and mongoose, you can use the $expr operator (MongoDB 3.6+) to guard the update in the query itself:
const result = await db.collection.update(
  // Only match while the array still holds fewer than MAX_MESSAGES elements
  { $expr: { $lt: [{ "$size": "$messages" }, MAX_MESSAGES] } },
  { $push: { messages: newMessage } } // newMessage is whatever you are pushing
);
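Applied to the schema above, that looks something like the following sketch. The model name, settingsId, newMessage and MAX_MESSAGES are illustrative, and note that combining $expr with upsert: true is problematic, since a blocked push must not insert a new document:
const SettingsModel = mongoose.model('Settings', Settings); // hypothetical model name

const result = await SettingsModel.findOneAndUpdate(
  {
    _id: settingsId,
    // Match only while the array holds fewer than MAX_MESSAGES elements
    $expr: { $lt: [{ $size: "$messages" }, MAX_MESSAGES] }
  },
  { $push: { messages: newMessage } },
  { new: true }
);
if (result === null) {
  // Either the document does not exist or the limit has been reached
}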
You should be able to do that with the pre update hook. The thing is that the hook does not, by default, give you the update being made so you can verify it etc. You have to take it via this.getUpdate():
Settings.pre('update', async function (next) {
  var preUpdate = this.getUpdate();
  // preUpdate now holds the update being made, and should contain the array
  // on which you can check the length
});
To give you an idea, in my test schema I had to do something like this on an update with a $set:
this.getUpdate().$set.books.length // gave me 2, which was correct etc.
I also had no issues running and hitting the update hook at all. It looks super simple out of the mongoose docs:
AuthorSchema.pre('update', function(next) {
  console.log('UPDATE hook fired!');
  console.log(this.getUpdate());
  next();
});
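Putting the two answers together, a hedged sketch of a hook that inspects the update document. Note the caveat: the hook only sees the update being sent, not the array already stored, so on its own it can only reject oversized $push ... $each batches; the $expr filter above is what covers the stored length:
Settings.pre('update', function (next) {
  const update = this.getUpdate();
  const pushed = update.$push && update.$push.messages;
  // A $push with $each carries several elements at once; a plain $push carries one
  const batchSize = pushed && pushed.$each ? pushed.$each.length : (pushed ? 1 : 0);
  if (batchSize > MAX_MESSAGES) {
    return next(new Error('Too many messages'));
  }
  next();
});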

MongoDB - two updates in sequence overlap each other

We are building a size calculation mechanism for our system.
In order to calculate sizes, we start with the first atomic operation - findAndModify - to find the object and add lock properties to it (to prevent other calculations for this object from interacting with it, and to make them wait till the end, as we could have many parallel calculations - in this case the others should be postponed). Then we calculate the sizes of specific properties, and after this operation we add the metadata to the object and delete the locks.
However, it seems that sometimes, when we have a lot of concurrent calculations for a single object (especially when we calculate many objects in parallel), some updates aren't executed.
_size metadata during calculation looks like this:
{
  _lockedAt: SomeDate,
  _transactionId: 'abc'
}
And after calculation it should look like this:
{
  somePropertySize: 123,
  anotherPropertySize: 1245,
  (...)
  _total: 131431523 // Some number
  // Notice that both _lockedAt and _transactionId should be missing
}
And this is what our update flow looks like:
return Promise.coroutine(function * () {
  yield object.findOneAndUpdate({
    '_id': gemId,
    '_size._lockedAt': {
      $exists: false
    }
  }, {
    $set: {
      '_size._lockedAt': moment.utc().toDate(),
      '_size._transactionId': transactionId
    }
  }).then(results => results.value);

  // Calculations are performed here, new _size object is built

  yield object.findOneAndUpdate({
    _id: gemId,
    _lockedAt: {
      $exists: true // We tried both with and without this property, does not change anything
    }
  }, {
    $set: {
      _size: newSizeObject
    }
  });
})()
An example real-life object JUST before the second update (truncated for brevity):
{
  title: 11,
  description: 2,
  detailedSection: 0,
  tags: 2,
  file: 5625898,
  _total: 5625913
}
For some reason, when we have multiple calculations running close together, sometimes (for new objects, without a _size property at all) the objects are left with a _size object looking exactly as it did after locking, despite the fact that the logs show everything went well (the calculations completed, the new sizes object was built, and the second DB update was called).
We use MongoDB 3.0 with two replica sets. Any ideas on what is happening?
Put the second update after the then so it will wait until the promise resolves:
object.findOneAndUpdate({
  '_id': gemId,
  '_size._lockedAt': {
    $exists: false
  }
}, {
  $set: {
    '_size._lockedAt': moment.utc().toDate(),
    '_size._transactionId': transactionId
  }
}).then(results => {
  // Calculations are performed here, new _size object is built
  object.findOneAndUpdate({
    _id: gemId,
    _lockedAt: {
      $exists: true // We tried both with and without this property, does not change anything
    }
  }, {
    $set: {
      _size: newSizeObject
    }
  });
}).catch(err => console.error(err));
Also make sure you have error handling for your promises using catch.
If you don't really need the lock or transaction fields then I would remove that stuff. If you do need them, something like RethinkDB may work a little better, or PostgreSQL could give you real transactions.
All in all, I checked the code very carefully, and what was happening in reality was that a completely different part of the code was querying the object from the DB and then, after a few other operations (mine included), writing the whole object back to the DB (hence overwriting my changes).
So, an important note for every MongoDB user - please remember that MongoDB is not transactional, but it is atomic: it guarantees that a single operation will be persisted, but it does not guarantee consistency of data across separate operations.
To sum up, things I learned from this example:
NEVER update a whole object in the database with data obtained from it some time before (e.g. by querying, changing some properties and saving it again)
USE $set, $inc, $unset and the other special operators; see the sketch after this list. If you have a lot of parameters, use e.g. the mongo-dot-notation npm library to flatten your data into a $set selector.
If something unexpected is happening with your data (e.g. missing properties after saving), the first thing to investigate is other pending operations on those specific entities
The least probable cause of your problems is MongoDB itself. It is usually code that does not follow atomicity rules (which probably happens to a lot of people used to transactional DBs :)).
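For illustration, a minimal sketch of the second point applied to the _size example above, inside the same Promise.coroutine flow - touch only the computed fields and release the lock in one atomic call (the field values are illustrative):
yield object.findOneAndUpdate({
  _id: gemId,
  '_size._transactionId': transactionId // only release our own lock
}, {
  $set: {
    '_size.somePropertySize': 123,
    '_size._total': 131431523
  },
  $unset: {
    '_size._lockedAt': "",
    '_size._transactionId': ""
  }
});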

Mongoose update multiple geospatial index with no limit

I have some Mongoose models with geospatial indexes:
var User = new Schema({
  "name" : String,
  "location" : {
    "id" : String,
    "name" : String,
    "loc" : { type : Array, index : '2d' }
  }
});
I'm trying to update all items that are in an area - for instance:
User.update({ "location.loc" : { "$near" : [ -122.4192, 37.7793 ], "$maxDistance" : 0.4 } }, { "foo" : "bar" },{ "multi" : true }, function(err){
console.log("done!");
});
However, this appears to only update the first 100 records. Looking at the docs, it appears there is a native limit on finds on geospatial indexes that applies when you don't set a limit.
(from docs:
Use limit() to specify a maximum number of points to return (a default limit of 100 applies if unspecified))
This appears to also apply to updates, regardless of the multi flag, which is a giant drag. If I apply an update, it only updates the first 100.
Right now the only way I can think of to get around this is to do something hideous like this:
Model.find({"location.loc" : { "$near" : [ -122.4192, 37.7793 ], "$maxDistance" : 0.4 } },{limit:0},function(err,results){
var ids = results.map(function(r){ return r._id; });
Model.update({"_id" : { $in : ids }},{"foo":"bar"},{multi:true},function(){
console.log("I have enjoyed crippling your server.");
});
});
While I'm not even entirely sure that would work (and it could be mildly optimized by only selecting the _id), I'd really like to avoid keeping an array of n ids in memory, as that number could get very large.
Edit:
The above hack doesn't even work; it looks like a find with {limit:0} still returns 100 results. So, in an act of sheer desperation and frustration, I have written a recursive method to paginate through ids, then return them so I can update using the above method. I have added the method as an answer below, but not accepted it, in hopes that someone will find a better way.
This is a problem in the MongoDB server core as far as I can tell, so mongoose and node-mongodb-native are not to blame. However, this is really stupid, as geospatial indexes are one of the few reasons to use MongoDB over some other more robust NoSQL stores.
Is there a way to achieve this? Even in node-mongodb-native, or the mongo shell, I can't seem to find a way to set (or in this case, remove by setting to 0) a limit on an update.
I'd love to see this issue fixed, but I can't figure out a way to set a limit on an update, and after extensive research, it doesn't appear to be possible. In addition, the hack in the question doesn't even work, I still only get 100 records with a find and limit set to 0.
Until this is fixed in mongo, here's how I'm getting around it (!!WARNING: UGLY HACKS AHEAD!!):
var getIdsPaginated = function(query, batch, callback){
  // set a default batch if it isn't passed.
  if(!callback){
    callback = batch;
    batch = 10000;
  }
  // define our array and a find method we can call recursively.
  var all = [],
      find = function(skip){
        // skip defaults to 0
        skip = skip || 0;
        this.find(query, ['_id'], { limit: batch, skip: skip }, function(err, items){
          if(err){
            // if an error is thrown, call back with it and how far we got in the array.
            callback(err, all);
          } else if(items && items.length){
            // if we returned any items, grab their ids and put them in the 'all' array
            var ids = items.map(function(i){ return i._id.toString(); });
            all = all.concat(ids);
            // recurse
            find.call(this, skip + batch);
          } else {
            // we have recursed and not returned any ids. This means we have them all.
            callback(err, all);
          }
        }.bind(this));
      };
  // start the recursion
  find.call(this);
}
This method will return a giant array of _ids. Because they are already indexed, it's actually pretty fast, but it's still calling the db many more times than is necessary. When this method calls back, you can do an update with the ids, like this:
Model.update({ "_id": { $in: ids } }, { 'foo': 'bar' }, { multi: true }, function(err){ console.log('hooray, more than 100 records updated.'); });
This isn't the most elegant way to solve this problem, and you can tune its efficiency by setting the batch size based on expected results, but obviously the ability to simply call update (or find, for that matter) on $near queries without a limit would really help.
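If holding the full id array in memory is a concern, the same recursion can apply the update one page at a time instead of accumulating ids first. An illustrative sketch in the same callback style (invoke with the model as this, e.g. via .call):
var updateByBatch = function(query, update, batch, callback){
  var Model = this;
  batch = batch || 10000;
  (function page(skip){
    Model.find(query, ['_id'], { limit: batch, skip: skip }, function(err, items){
      if(err || !items || !items.length){
        // finished, either by error or by exhausting the results
        return callback(err);
      }
      var ids = items.map(function(i){ return i._id; });
      Model.update({ "_id": { $in: ids } }, update, { multi: true }, function(err){
        if(err){ return callback(err); }
        page(skip + batch); // recurse into the next page
      });
    });
  })(0);
};

// usage:
updateByBatch.call(User,
  { "location.loc": { "$near": [ -122.4192, 37.7793 ], "$maxDistance": 0.4 } },
  { "foo": "bar" },
  1000,
  function(err){ console.log(err || 'done'); }
);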
