MongoDB nested array update multiple documents [duplicate] - node.js

I am trying to update a value in the nested array but can't get it to work.
My object is like this
{
"_id": {
"$oid": "1"
},
"array1": [
{
"_id": "12",
"array2": [
{
"_id": "123",
"answeredBy": [], // need to push "success"
},
{
"_id": "124",
"answeredBy": [],
}
],
}
]
}
I need to push a value to "answeredBy" array.
In the below example, I tried pushing "success" string to the "answeredBy" array of the "123 _id" object but it does not work.
callback = function(err,value){
if(err){
res.send(err);
}else{
res.send(value);
}
};
conditions = {
"_id": 1,
"array1._id": 12,
"array2._id": 123
};
updates = {
$push: {
"array2.$.answeredBy": "success"
}
};
options = {
upsert: true
};
Model.update(conditions, updates, options, callback);
I found this link, but its answer only says I should use object like structure instead of array's. This cannot be applied in my situation. I really need my object to be nested in arrays
It would be great if you can help me out here. I've been spending hours to figure this out.
Thank you in advance!

General Scope and Explanation
There are a few things wrong with what you are doing here. Firstly your query conditions. You are referring to several _id values where you should not need to, and at least one of which is not on the top level.
In order to get into a "nested" value and also presuming that _id value is unique and would not appear in any other document, you query form should be like this:
Model.update(
{ "array1.array2._id": "123" },
{ "$push": { "array1.0.array2.$.answeredBy": "success" } },
function(err,numAffected) {
// something with the result in here
}
);
Now that would actually work, but really it is only a fluke that it does as there are very good reasons why it should not work for you.
The important reading is in the official documentation for the positional $ operator under the subject of "Nested Arrays". What this says is:
The positional $ operator cannot be used for queries which traverse more than one array, such as queries that traverse arrays nested within other arrays, because the replacement for the $ placeholder is a single value
Specifically what that means is the element that will be matched and returned in the positional placeholder is the value of the index from the first matching array. This means in your case the matching index on the "top" level array.
So if you look at the query notation as shown, we have "hardcoded" the first ( or 0 index ) position in the top level array, and it just so happens that the matching element within "array2" is also the zero index entry.
To demonstrate this you can change the matching _id value to "124" and the result will $push an new entry onto the element with _id "123" as they are both in the zero index entry of "array1" and that is the value returned to the placeholder.
So that is the general problem with nesting arrays. You could remove one of the levels and you would still be able to $push to the correct element in your "top" array, but there would still be multiple levels.
Try to avoid nesting arrays as you will run into update problems as is shown.
The general case is to "flatten" the things you "think" are "levels" and actually make theses "attributes" on the final detail items. For example, the "flattened" form of the structure in the question should be something like:
{
"answers": [
{ "by": "success", "type2": "123", "type1": "12" }
]
}
Or even when accepting the inner array is $push only, and never updated:
{
"array": [
{ "type1": "12", "type2": "123", "answeredBy": ["success"] },
{ "type1": "12", "type2": "124", "answeredBy": [] }
]
}
Which both lend themselves to atomic updates within the scope of the positional $ operator
MongoDB 3.6 and Above
From MongoDB 3.6 there are new features available to work with nested arrays. This uses the positional filtered $[<identifier>] syntax in order to match the specific elements and apply different conditions through arrayFilters in the update statement:
Model.update(
{
"_id": 1,
"array1": {
"$elemMatch": {
"_id": "12","array2._id": "123"
}
}
},
{
"$push": { "array1.$[outer].array2.$[inner].answeredBy": "success" }
},
{
"arrayFilters": [{ "outer._id": "12" },{ "inner._id": "123" }]
}
)
The "arrayFilters" as passed to the options for .update() or even
.updateOne(), .updateMany(), .findOneAndUpdate() or .bulkWrite() method specifies the conditions to match on the identifier given in the update statement. Any elements that match the condition given will be updated.
Because the structure is "nested", we actually use "multiple filters" as is specified with an "array" of filter definitions as shown. The marked "identifier" is used in matching against the positional filtered $[<identifier>] syntax actually used in the update block of the statement. In this case inner and outer are the identifiers used for each condition as specified with the nested chain.
This new expansion makes the update of nested array content possible, but it does not really help with the practicality of "querying" such data, so the same caveats apply as explained earlier.
You typically really "mean" to express as "attributes", even if your brain initially thinks "nesting", it's just usually a reaction to how you believe the "previous relational parts" come together. In reality you really need more denormalization.
Also see How to Update Multiple Array Elements in mongodb, since these new update operators actually match and update "multiple array elements" rather than just the first, which has been the previous action of positional updates.
NOTE Somewhat ironically, since this is specified in the "options" argument for .update() and like methods, the syntax is generally compatible with all recent release driver versions.
However this is not true of the mongo shell, since the way the method is implemented there ( "ironically for backward compatibility" ) the arrayFilters argument is not recognized and removed by an internal method that parses the options in order to deliver "backward compatibility" with prior MongoDB server versions and a "legacy" .update() API call syntax.
So if you want to use the command in the mongo shell or other "shell based" products ( notably Robo 3T ) you need a latest version from either the development branch or production release as of 3.6 or greater.
See also positional all $[] which also updates "multiple array elements" but without applying to specified conditions and applies to all elements in the array where that is the desired action.

I know this is a very old question, but I just struggled with this problem myself, and found, what I believe to be, a better answer.
A way to solve this problem is to use Sub-Documents. This is done by nesting schemas within your schemas
MainSchema = new mongoose.Schema({
array1: [Array1Schema]
})
Array1Schema = new mongoose.Schema({
array2: [Array2Schema]
})
Array2Schema = new mongoose.Schema({
answeredBy": [...]
})
This way the object will look like the one you show, but now each array are filled with sub-documents. This makes it possible to dot your way into the sub-document you want. Instead of using a .update you then use a .find or .findOne to get the document you want to update.
Main.findOne((
{
_id: 1
}
)
.exec(
function(err, result){
result.array1.id(12).array2.id(123).answeredBy.push('success')
result.save(function(err){
console.log(result)
});
}
)
Haven't used the .push() function this way myself, so the syntax might not be right, but I have used both .set() and .remove(), and both works perfectly fine.

Related

Differences between $or and $in when querying Strings in MongoDB / mongoose

While building an API, I need to match documents that contain pending or active values for the key status.
When trying
args.status = {
$or: [
'active',
'pending'
]
}
I get an error: cannot use $or with string
However,
args.status = {
$in: [
'active',
'pending'
]
}
works just fine.
I would expect $or to work here. Can someone provide context on the differences between the two and why Strings require $in?
$or performs the logical OR operation on an ARRAY with more than two expressions e.g. {$or:[{name:"a"},{name:"b"}]} This query will return the record which are having either name 'a' or 'b'.
$in works on the array and return the documents which are which contains any of the field from your specified array e.g.{name:{$in:['a','b']}} This query will return the documents where name is either 'a' or 'b'.
Ideally both are doing same but just having the syntax difference.
In your case you have to modify your OR query syntax and add the condition expessions in an ARRAY.
{ $or: [
{
"args.status": "active"
},
{
"args.status": "pending"
}
]
}
Thats because $or expects array of objects. Objects that defines some filters out of which at least one needs to be match to return the result. For your particular scenario $in is the best option. Still if you wanna go with $or, the query will be like:
{
$or: [
{'args.status' : {$eq: 'active'}},
{'args.status' : {$eq: 'pending'}}
]
}
I'd suggest you stick with $in as it is the best fit for your requirement.
You can check the official docs for more details on $or
Hope this helps :)

Upsert and $inc Sub-document in Array

The following schema is intended to record total views and views for a very specific day only.
const usersSchema = new Schema({
totalProductsViews: {type: Number, default: 0},
productsViewsStatistics: [{
day: {type: String, default: new Date().toISOString().slice(0, 10), unique: true},
count: {type: Number, default: 0}
}],
});
So today views will be stored in another subdocument different from yesterday. To implement this I tried to use upsert so as subdocument will be created each day when product is viewed and counts will be incremented and recorded based on a particular day. I tried to use the following function but seems not to work the way I intended.
usersSchema.statics.increaseProductsViews = async function (id) {
//Based on day only.
const todayDate = new Date().toISOString().slice(0, 10);
const result = await this.findByIdAndUpdate(id, {
$inc: {
totalProductsViews: 1,
'productsViewsStatistics.$[sub].count': 1
},
},
{
upsert: true,
arrayFilters: [{'sub.day': todayDate}],
new: true
});
console.log(result);
return result;
};
What do I miss to get the functionality I want? Any help will be appreciated.
What you are trying to do here actually requires you to understand some concepts you may not have grasped yet. The two primary ones being:
You cannot use any positional update as part of an upsert since it requires data to be present
Adding items into arrays mixed with "upsert" is generally a problem that you cannot do in a single statement.
It's a little unclear if "upsert" is your actual intention anyway or if you just presumed that was what you had to add in order to get your statement to work. It does complicate things if that is your intent, even if it's unlikely give the finByIdAndUpdate() usage which would imply you were actually expecting the "document" to be always present.
At any rate, it's clear you actually expect to "Update the array element when found, OR insert a new array element where not found". This is actually a two write process, and three when you consider the "upsert" case as well.
For this, you actually need to invoke the statements via bulkWrite():
usersSchema.statics.increaseProductsViews = async function (_id) {
//Based on day only.
const todayDate = new Date().toISOString().slice(0, 10);
await this.bulkWrite([
// Try to match an existing element and update it ( do NOT upsert )
{
"updateOne": {
"filter": { _id, "productViewStatistics.day": todayDate },
"update": {
"$inc": {
"totalProductsViews": 1,
"productViewStatistics.$.count": 1
}
}
}
},
// Try to $push where the element is not there but document is - ( do NOT upsert )
{
"updateOne": {
"filter": { _id, "productViewStatistics.day": { "$ne": todayDate } },
"update": {
"$inc": { "totalProductViews": 1 },
"$push": { "productViewStatistics": { "day": todayDate, "count": 1 } }
}
}
},
// Finally attempt upsert where the "document" was not there at all,
// only if you actually mean it - so optional
{
"updateOne": {
"filter": { _id },
"update": {
"$setOnInsert": {
"totalProductViews": 1,
"productViewStatistics": [{ "day": todayDate, "count": 1 }]
}
}
}
])
// return the modified document if you really must
return this.findById(_id); // Not atomic, but the lesser of all evils
}
So there's a real good reason here why the positional filtered [<identifier>] operator does not apply here. The main good reason is the intended purpose is to update multiple matching array elements, and you only ever want to update one. This actually has a specific operator in the positional $ operator which does exactly that. It's condition however must be included within the query predicate ( "filter" property in UpdateOne statements ) just as demonstrated in the first two statements of the bulkWrite() above.
So the main problems with using positional filtered [<identifier>] are that just as the first two statements show, you cannot actually alternate between the $inc or $push as would depend on if the document actually contained an array entry for the day. All that will happen is at best no update will be applied when the current day is not matched by the expression in arrayFilters.
The at worst case is an actual "upsert" will throw an error due to MongoDB not being able to decipher the "path name" from the statement, and of course you simply cannot $inc something that does not exist as a "new" array element. That needs a $push.
That leaves you with the mechanic that you also cannot do both the $inc and $push within a single statement. MongoDB will error that you are attempting to "modify the same path" as an illegal operation. Much the same applies to $setOnInsert since whilst that operator only applies to "upsert" operations, it does not preclude the other operations from happening.
Thus the logical steps fall back to what the comments in the code also describe:
Attempt to match where the document contains an existing array element, then update that element. Using $inc in this case
Attempt to match where the document exists but the array element is not present and then $push a new element for the given day with the default count, updating other elements appropriately
IF you actually did intend to upsert documents ( not array elements, because that's the above steps ) then finally actually attempt an upsert creating new properties including a new array.
Finally there is the issue of the bulkWrite(). Whilst this is a single request to the server with a single response, it still is effectively three ( or two if that's all you need ) operations. There is no way around that and it is better than issuing chained separate requests using findByIdAndUpdate() or even updateOne().
Of course the main operational difference from the perspective of code you attempted to implement is that method does not return the modified document. There is no way to get a "document response" from any "Bulk" operation at all.
As such the actual "bulk" process will only ever modify a document with one of the three statements submitted based on the presented logic and most importantly the order of those statements, which is important. But if you actually wanted to "return the document" after modification then the only way to do that is with a separate request to fetch the document.
The only caveat here is that there is the small possibility that other modifications could have occurred to the document other than the "array upsert" since the read and update are separated. There really is no way around that, without possibly "chaining" three separate requests to the server and then deciding which "response document" actually applied the update you wanted to achieve.
So with that context it's generally considered the lesser of evils to do the read separately. It's not ideal, but it's the best option available from a bad bunch.
As a final note, I would strongly suggest actually storing the the day property as a BSON Date instead of as a string. It actually takes less bytes to store and is far more useful in that form. As such the following constructor is probably the clearest and least hacky:
const todayDate = new Date(new Date().setUTCHours(0,0,0,0))

How to fuzzy query against multiple fields in elasticsearch?

Here's my query as it stands:
"query":{
"fuzzy":{
"author":{
"value":query,
"fuzziness":2
},
"career_title":{
"value":query,
"fuzziness":2
}
}
}
This is part of a callback in Node.js. Query (which is being plugged in as a value to compare against) is set earlier in the function.
What I need it to be able to do is to check both the author and the career_title of a document, fuzzily, and return any documents that match in either field. The above statement never returns anything, and whenever I try to access the object it should create, it says it's undefined. I understand that I could write two queries, one to check each field, then sort the results by score, but I feel like searching every object for one field twice will be slower than searching every object for two fields once.
https://www.elastic.co/guide/en/elasticsearch/guide/current/fuzzy-match-query.html
If you see here, in a multi match query you can specify the fuzziness...
{
"query": {
"multi_match": {
"fields": [ "text", "title" ],
"query": "SURPRIZE ME!",
"fuzziness": "AUTO"
}
}
}
Somewhat like this.. Hope this helps.

Mongoose/Mongodb previous and next in embedded document

I'm learning Mongodb/Mongoose/Express and have come across a fairly complex query (relative to my current level of understanding anyway) that I'm not sure how best to approach. I have a collection - to keep it simple let's call it entities - with an embedded actions array:
name: String
actions: [{
name: String
date: Date
}]
What I'd like to do is to return an array of documents with each containing the most recent action (or most recent to a specified date), and the next action (based on the same date).
Would this be possible with one find() query, or would I need to break this down into multiple queries and merge the results to generate one result array? I'm looking for the most efficient route possible.
Provided that your "actions" are inserted with the "most recent" being the last entry in the list, and usually this will be the case unless you are specifically updating items and changing dates, then all you really want to do is "project" the last item of the array. This is what the $slice projection operation is for:
Model.find({},{ "actions": { "$slice": -1 } },function(err,docs) {
// contains an array with the last item
});
If indeed you are "updating" array items and changing dates, but you want to query for the most recent on a regular basis, then you are probably best off keeping the array ordered. You can do this with a few modifiers such as:
Model.update(
{
"_id": ObjectId("541f7bbb699e6dd5a7caf2d6"),
},
{
"$push": { "actions": { "$each": [], "$sort": { "date": 1 } } }
},
function(err,numAffected) {
}
);
Which is actually more of a trick that you can do with the $sort modifier to simply sort the existing array elements without adding or removing. In versions prior to 2.6 you need the $slice "update" modifier in here as well, but this could be set to a value larger than the expected array elements if you did not actually want to restrict the possible size, but that is probably a good idea.
Unfortunately, if you were "updating" via a $set statement, then you cannot do this "sorting" in a single update statement, as MongoDB will not allow both types of operations on the array at once. But if you can live with that, then this is a way to keep the array ordered so the first query form works.
If it just seems too hard to keep an array ordered by date, then you can in fact retrieve the largest value my means of the .aggregate() method. This allows greater manipulation of the documents than is available to basic queries, at a little more cost:
Model.aggregate([
// Unwind the array to de-normalize as documents
{ "$unwind": "$actions" },
// Sort the contents per document _id and inner date
{ "$sort": { "_id": 1, "actions.date": 1 } },
// Group back with the "last" element only
{ "$group": {
"_id": "$_id",
"name": { "$last": "$name" },
"actions": { "$last": "$actions" }
}}
],
function(err,docs) {
})
And that will "pull apart" the array using the $unwind operator, then process with a next stage to $sort the contents by "date". In the $group pipeline stage the "_id" means to use the original document key to "collect" on, and the $last operator picks the field values from the "last" document ( de-normalized ) on that grouping boundary.
So there are various things that you can do, but of course the best way is to keep your array ordered and use the basic projection operators to simply get the last item in the list.

Can I fetch last inserted subdocument in Mongoose using pop()?

I am doing a findOneAndUpdate call to add to a subdocument array using $addToSet. The operation returns the newly updated document. I would like to extract the subdocument that just got inserted. Is it safe to use pop() in this case? I know that browsers vary in how arrays/objects are ordered but not sure about the node/mongo case.
The general official line for "sets" in MongoDB is that they are not considered to be ordered in any way. There actually has been some discussion that "sets" may even constitute a different data type to that of an ordinary array at some point, but as yet there is no distinction made between types and they are internally treated just as another array.
It would appear however, that the general behavior actually is that any "new" item added to the set is indeed "appended" to the list of current elements, so anything new would indeed be the last element in an array. And arrays always do retain positions, which is the general point of the data type.
So whilst your point is basically asking:
"will the last added item be the last in the list?"
Then that would appear to be the case despite the given cases warning about the order of elements, which would generally appear to be from surface value more about the "internal" ordering, where adding "a" after "c" would not place the "a" element first in the list without additional modification.
But the question also comes off a little bit "funny" in that you are essentially asking:
"did my last update to the set appear on the end?"
Which IMHO seems to be really asking "Did it update or not?", or did the last member I added actually update the "set" or did the element already exist. In this case, rather than looking to see if the last element is the same as the one you wanted to add, the better option would be to check the WriteResult and see if the document was actually modified.
Consider this document:
{
"list": [ "c", "a" ]
}
And then issuing the the update with the Bulk Operations API to get the result:
var bulk = Model.collection("settest").initializeOrderedBulkOp();
bulk.find({}).updateOne({ "$addToSet": { "list": "a" } });
bulk.execute(function(err,result) {
console.log( JSON.stringify( result, undefined, 4 ) );
});
Would give you something like this:
BulkWriteResult({
"writeErrors" : [ ],
"writeConcernErrors" : [ ],
"nInserted" : 0,
"nUpserted" : 0,
"nMatched" : 1,
"nModified" : 0,
"nRemoved" : 0,
"upserted" : [ ]
})
The enhanced response tells you that even though 1 document was matched by the update statement, nothing was actually modified in the document itself since the $addToSet operation actually did not do anything where the element was already present in the set.
If you however considered the mongoose .findOneAndUpdate() method, which is derived from the basic underlying .findAndModify() method of the base driver, then the response you get back is the "modified" document by default:
Model.findOneAndUpdate(
{},
{ "$addToSet": { "list": "a" },
function(err,doc) {
console.log( JSON.stringify( doc, undefined, 4 ) );
}
);
So that is basically the same document as before, and there is no way to tell if the update operation actually did anything or not.
Now you could modify that to ask for the original document instead of the modified one:
Model.findOneAndUpdate(
{},
{ "$addToSet": { "list": "a" },
{ "new": false },
function(err,doc) {
console.log( JSON.stringify( doc, undefined, 4 ) );
}
);
But the only thing you really know about the result is that if the element you last added is not present in the original document, then that resulted in an update. Of course it is there "now" because you asked it to do that and the operation succeeded and returned you a document:
Model.findOneAndUpdate(
{},
{ "$addToSet": { "list": "a" },
{ "new": false },
function(err,doc) {
if ( doc.list.indexOf("a") == -1 )
console.log( "updated set" );
}
);
But is the "last element" in the list where added? As covered earlier, it would appear so, but whether this always remains the same is the question. But it really does not make a lot of sense to do so as you cannot guarantee that the operation you just did actually did anything without checking for it properly.
So in an atomic operation such as .findOneAndUpdate() where you get the document returned, or indeed in any "update" variant, the better option to "did this do anything" is to actually check the Write Result or the original document to what you expect the result of the operation to be.
In the end, if you need the whole document back then use .findOneAndUpdate() but with the "original document" option. You can safely assume that your operation succeeded and add the element to the list where it was not present originally if that is your desired result. The difference tells you if the update actually did anything.
If however you only want to know if a new element was indeed added, then just use the Bulk API approach, which gives you the correct response that the document was not updated since the element was already part of the "set".

Resources