Mongoose find for array elements with limit - node.js

I'm using https://mongoosejs.com/ for querying mongo
I want to find documents for each element of an array, like this:
var a = ["a","b","c"];
topic.find({topic:a}).limit(4).exec(.....
If I use it like this, I can only query for a single element, but I need all of them together. That means:
a limit of 4 for "a", a limit of 4 for "b", and a limit of 4 for "c".
For example, on Stack Overflow you ask a question and people answer it. There may be one answer, or two, or three, and Stack Overflow returns the comments of each answer with a limit. That is what I want to do.

You could use $facet for this, which will give you a single result document:
db.collection.aggregate([
  {
    $facet: {
      "a": [{ $match: { "topic": "a" } }, { $limit: 4 }],
      "b": [{ $match: { "topic": "b" } }, { $limit: 4 }],
      "c": [{ $match: { "topic": "c" } }, { $limit: 4 }]
    }
  }
])
If you need separate documents you would probably append the following stages at the end of the above pipeline:
{
  $project: {
    "result": { $concatArrays: [ "$a", "$b", "$c" ] }
  }
},
{
  $unwind: "$result"
}

I think the solution here would be to concatenate the results.
// Await each query so .map() runs on the resolved array of documents.
const aIds = (await topic.find({ topic: "a" }).limit(4)).map(el => el._id);
const bIds = (await topic.find({ topic: "b" }).limit(4)).map(el => el._id);
const cIds = (await topic.find({ topic: "c" }).limit(4)).map(el => el._id);
topic.find({ _id: { $in: aIds.concat(bIds, cIds) } }).exec(.....
Not the most efficient, but it will work.
UPDATE
Initially, I was not aware that this question related to Mongoose.
I'm not familiar with it, but the idea remains the same:
find the three topics (a, b, c), each with its own limit, then join the results.
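With Mongoose that idea can be written as one limited query per topic, run in parallel and then merged (a sketch; Topic is the assumed model name):
const topics = ["a", "b", "c"];
// Run the three limited queries in parallel and collect the ids.
const perTopic = await Promise.all(
  topics.map(t => Topic.find({ topic: t }).limit(4).select("_id").lean())
);
const ids = perTopic.flat().map(doc => doc._id);
// Fetch the combined set in one go.
const docs = await Topic.find({ _id: { $in: ids } });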

Related

How can I write query in mongodb?

I have a MongoDB collection like this:
[
  {
    "_id": ObjectId("51780fb5c9c41825e3e21fc4"),
    "name": "CS 101",
    "students": [
      { "name": "raj", "year": 2016 },
      { "name": "rahul", "year": 2017 },
      { "name": "anil", "year": 2018 }
    ]
  },
  {
    "_id": ObjectId("51780fb5c9c41825e3e21fs4"),
    "name": "CS 102",
    "students": [
      { "name": "mukesh", "year": 2016 },
      { "name": "mohan", "year": 2017 },
      { "name": "mangal", "year": 2018 }
    ]
  }
]
I've been looking at similar questions like this one: Mongo db - Querying nested array and objects, but in that question they're looking for a specific element inside the "messages" object, which isn't my case. Same as in this other question: Query for a field in an object in an array with Mongo?, where they're using $map, and I don't think it fits my needs.
The documents I want to get back have this structure:
[
  {
    "_id": ObjectId("51780fb5c9c41825e3e21fc4"),
    "name": "CS 101",
    "students": ["raj", "rahul", "anil"]
  },
  {
    "_id": ObjectId("51780fb5c9c41825e3e21fs4"),
    "name": "CS 102",
    "students": ["mukesh", "mohan", "mangal"]
  }
]
How can I solve this?
From the question and datasets, you are trying to return the documents with students as an array of student names (strings) instead of an array of student objects.
Use $project to output students as the "$students.name" array:
db.collection.aggregate([
  {
    $project: {
      "_id": "$_id",
      "name": "$name",
      "students": "$students.name"
    }
  }
])
Sample Solution 1 on Mongo Playground
OR
Use $set to replace the students field with the "$students.name" array:
db.collection.aggregate([
  {
    $set: {
      "students": "$students.name"
    }
  }
])
Sample Solution 2 on Mongo Playground
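If you need the same transformation from Mongoose rather than the shell, the pipeline passes through unchanged (a sketch; the model name Course is hypothetical):
// Same $set stage, run through a hypothetical Mongoose model.
const courses = await Course.aggregate([
  { $set: { students: "$students.name" } }
]);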

Mongoose - Query deeply nested Objects

I currently have a problem where I have to update entries in a deeply nested document. To simplify the problem, here is an example. Let's assume I store cars in MongoDB. A document would look like this:
{
  Make: "BMW",
  Model: "3Series",
  Wheels: [
    {
      _id: someObjectId,
      Size: "19 inch",
      Screws: [
        {
          _id: someObjectId,
          Type: "M15x40"
        },
        {
          _id: someObjectId,
          Type: "M15x40"
        }
      ]
    }
  ]
}
Now if I want to update a specific wheel, my code would look somewhat like this:
CarModel.findOneAndUpdate({
  "_id": CarId, "Wheels._id": WheelId
}, {
  "$set": {
    "Wheels.$.Size": NewSize
  }
})
Now this works. But I am pretty lost on how I would update a specific screw, since I am going through two arrays. Any idea how I could make this work?
You need arrayFilters functionality to define the path for more than one nested array:
CarModel.findOneAndUpdate(
  { "_id": CarId },
  { $set: { "Wheels.$[wheel].Screws.$[screw].Type": "something" } },
  { arrayFilters: [ { 'wheel._id': WheelId }, { 'screw._id': screwId } ] }
)
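For reference, arrayFilters requires MongoDB 3.6 or newer. A minimal usage sketch against the schema above, updating the parent wheel and one screw in the same call (CarId, WheelId, ScrewId and the new values are placeholders):
const updated = await CarModel.findOneAndUpdate(
  { _id: CarId },
  {
    $set: {
      "Wheels.$[wheel].Size": "20 inch",                 // parent array element
      "Wheels.$[wheel].Screws.$[screw].Type": "M14x35"   // nested array element
    }
  },
  {
    arrayFilters: [{ "wheel._id": WheelId }, { "screw._id": ScrewId }],
    new: true   // return the document after the update
  }
);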

Use $lookup with a Conditional Join

Provided I have the following documents:
User
{
  uuid: string,
  isActive: boolean,
  lastLogin: datetime,
  createdOn: datetime
}
Projects
{
  id: string,
  users: [
    {
      uuid: string,
      otherInfo: ...
    },
    { ...more users }
  ]
}
And I want to select all users that haven't logged in for 2 weeks and are inactive, or that haven't logged in for 5 weeks and have no projects.
Now, the 2-week part is working fine, but I cannot seem to figure out how to do the "5 weeks and no projects" part.
I came up with something like the query below, but the last part does not work because $exists obviously is not a top-level operator.
Has anyone ever done anything like this?
Thanks!
return await this.collection
.aggregate([
{
$match: {
$and: [
{
$expr: {
$allElementsTrue: {
$map: {
input: [`$lastLogin`, `$createdOn`],
in: { $lt: [`$$this`, twoWeeksAgo] }
}
}
}
},
{
$or: [
{
isActive: false
},
{
$and: [
{
$expr: {
$allElementsTrue: {
$map: {
input: [`$lastLogin`, `$createdOn`],
in: { $lt: [`$$this`, fiveWeeksAgo] }
}
}
}
},
{
//No projects exists on this user
$exists: {
$lookup: {
from: _.get(Config, `env.collection.projects`),
let: {
currentUser: `$$ROOT`
},
pipeline: [
{
$project: {
_id: 0,
users: {
$filter: {
input: `$users`,
as: `user`,
cond: {
$eq: [`$$user.uuid`, `$currentUser.uuid`]
}
}
}
}
}
]
}
}
}
]
}
]
}
]
}
}
])
.toArray();
Not certain why you thought $expr was needed in the initial $match, but really:
const getResults = () => {
const now = Date.now();
const twoWeeksAgo = new Date(now - (1000 * 60 * 60 * 24 * 7 * 2 ));
const fiveWeeksAgo = new Date(now - (1000 * 60 * 60 * 24 * 7 * 5 ));
// as long as mongoDriverCollectionReference points to a "Collection" object
// for the "users" collection
return mongoDriverCollectionReference.aggregate([
// No $expr, since you can actually use an index. $expr cannot do that
{ "$match": {
"$or": [
// Active and "logged in"/created in the last 2 weeks
{
"isActive": true,
"$or": [
{ "lastLogin": { "$gte": twoWeeksAgo } },
{ "createdOn": { "$gte": twoWeeksAgo } }
]
},
// Also want those who...
// Not Active and "logged in"/created in the last 5 weeks
// we'll "tag" them later
{
"isActive": false,
"$or": [
{ "lastLogin": { "$gte": fiveWeeksAgo } },
{ "createdOn": { "$gte": fiveWeeksAgo } }
]
}
]
}},
// Now we do the "conditional" stuff, just to return a matching result or not
{ "$lookup": {
"from": _.get(Config, `env.collection.projects`), // there are a lot cleaner ways to register models than this
"let": {
"uuid": {
"$cond": {
"if": "$isActive", // this is boolean afterall
"then": null, // don't really want to match
"else": "$uuid" // Okay to match the 5 week results
}
}
},
"pipeline": [
// Nothing complex here as null will return nothing. Just do $in for the array
{ "$match": { "$in": [ "$$uuid", "$users.uuid" ] } },
// Don't really need the detail, so just reduce any matches to one result of [null]
{ "$group": { "_id": null } }
],
"as": "projects"
}},
// Now test if the $lookup returned something where it mattered
{ "$match": {
"$or": [
{ "active": true }, // remember we selected the active ones already
{
"projects.0": { "$exists": false } // So now we only need to know the "inactive" returned no array result.
}
]
}}
]).toArray(); // returns a Promise
};
It's pretty simple: calculated expressions via $expr are actually really bad and not what you want in a first pipeline stage. They're also "not what you need", since createdOn and lastLogin really should not have been merged into an array for $allElementsTrue, which would just be an AND condition, whereas the logic you described would really mean OR. So the $or does just fine here.
So does the $or on the separation of conditions for the isActive of true/false. Again it's either "two weeks" OR "five weeks". And this certainly does not need $expr since standard inequality range matching works fine, and uses an "index".
Then you really just want to do the "conditional" things in the let for $lookup instead of your "does it exist" thinking. All you really need to know ( since the range selection of dates is actually already done ) is whether active is now true or false. Where it's active ( meaning by your logic you don't care about projects ) simply make the $$uuid used within the $match pipeline stage a null value so it will not match and the $lookup returns an empty array. Where false ( also already matching the date conditions from earlier ) then you use the actual value and "join" ( where there are projects of course ).
Then it's just a simple matter of keeping the active users, and then only testing the remaining false values for active to see if the "projects" array from the $lookup actually returned anything. If it did not, then they just don't have projects.
Probably should note here is since users is an "array" within the projects collection, you use $in for the $match condition against the single value to the array.
Note that for brevity we can use $group inside the inner pipeline to only return one result instead of possibly many matches to actual matched projects. You don't care about the content or the "count", but simply if one was returned or nothing. Again following the presented logic.
This gets you your desired results, and it does so in a manner that is efficient and actually uses indexes where available.
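As a hedged side note, indexes along these lines would give each branch of that first $or something to use (the users collection name here is a placeholder; verify the actual plan with explain()):
// Equality on isActive plus a range on each date field.
db.users.createIndex({ isActive: 1, lastLogin: -1 })
db.users.createIndex({ isActive: 1, createdOn: -1 })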
Also return await certainly does not do what you think it does, and in fact it's an ESLint warning message ( I suggest you enable ESLint in your project ) since it's not a smart thing to do. It does nothing really, as you would need to await getResults() ( as per the example naming ) anyway, as the await keyword is not "magic" but just a prettier way of writing then(). As well as hopefully being easier to understand, once you understand what async/await is really for syntactically that is.

Push if not present or update a nested array mongoose [duplicate]

I have documents that look something like this, with a unique index on bars.name:
{ name: 'foo', bars: [ { name: 'qux', somefield: 1 } ] }
I want to either update the sub-document where { name: 'foo', 'bars.name': 'qux' } and $set: { 'bars.$.somefield': 2 }, or create a new sub-document with { name: 'qux', somefield: 2 } under { name: 'foo' }.
Is it possible to do this using a single query with upsert, or will I have to issue two separate ones?
Related: 'upsert' in an embedded document (it suggests changing the schema to have the sub-document identifier as the key, but that is from two years ago and I'm wondering if there are better solutions now).
No, there isn't really a better solution to this, so perhaps an explanation is in order.
Suppose you have a document in place that has the structure as you show:
{
  "name": "foo",
  "bars": [{
    "name": "qux",
    "somefield": 1
  }]
}
If you do an update like this
db.foo.update(
{ "name": "foo", "bars.name": "qux" },
{ "$set": { "bars.$.somefield": 2 } },
{ "upsert": true }
)
Then all is fine because a matching document was found. But if you change the value of "bars.name":
db.foo.update(
{ "name": "foo", "bars.name": "xyz" },
{ "$set": { "bars.$.somefield": 2 } },
{ "upsert": true }
)
Then you will get a failure. The only thing that has really changed here is that in MongoDB 2.6 and above the error is a little more succinct:
WriteResult({
  "nMatched" : 0,
  "nUpserted" : 0,
  "nModified" : 0,
  "writeError" : {
    "code" : 16836,
    "errmsg" : "The positional operator did not find the match needed from the query. Unexpanded update: bars.$.somefield"
  }
})
That is better in some ways, but you really do not want to "upsert" anyway. What you want to do is add the element to the array where the "name" does not currently exist.
So what you really want is the "result" from the update attempt without the "upsert" flag to see if any documents were affected:
db.foo.update(
{ "name": "foo", "bars.name": "xyz" },
{ "$set": { "bars.$.somefield": 2 } }
)
Yielding in response:
WriteResult({ "nMatched" : 0, "nUpserted" : 0, "nModified" : 0 })
So when the number of modified documents is 0, you know you need to issue the following update:
db.foo.update(
  { "name": "foo" },
  { "$push": { "bars": {
    "name": "xyz",
    "somefield": 2
  }}}
)
There really is no other way to do exactly what you want. As the additions to the array are not strictly a "set" type of operation, you cannot use $addToSet combined with the "bulk update" functionality there, so that you can "cascade" your update requests.
In this case it seems like you need to check the result, or otherwise accept reading the whole document and checking whether to update or insert a new array element in code.
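A minimal sketch of that check-then-push flow with the Node driver (the collection, document and field names are the ones from the example; the result fields assume a modern driver that reports matchedCount):
async function setOrPushBar(db, barName, somefield) {
  // Try the positional update first.
  const res = await db.collection('foo').updateOne(
    { name: 'foo', 'bars.name': barName },
    { $set: { 'bars.$.somefield': somefield } }
  );
  // Nothing matched, so the array element does not exist yet: push it instead.
  if (res.matchedCount === 0) {
    await db.collection('foo').updateOne(
      { name: 'foo' },
      { $push: { bars: { name: barName, somefield } } }
    );
  }
}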
If you don't mind changing the schema a bit and having a structure like so:
{ "name": "foo", "bars": { "qux": { "somefield": 1 },
"xyz": { "somefield": 2 },
}
}
You can perform your operations in one go.
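For example, with that keyed structure a single upsert handles both the "update" and the "insert" cases (a sketch following the reshaped document above):
// Creates the nested path if it is missing, updates it if it exists.
db.foo.update(
  { "name": "foo" },
  { "$set": { "bars.qux.somefield": 2 } },
  { "upsert": true }
)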
Reiterating 'upsert' in an embedded document for completeness
I was digging for the same feature, and found that in version 4.2 or above, MongoDB provides a new feature called Update with aggregation pipeline.
This feature, used with some other techniques, makes it possible to achieve a sub-document upsert with a single query.
It's a very verbose query, but I believe that if you know you won't have too many records in the sub-collection, it's viable. Here's an example of how to achieve this:
const documentQuery = { _id: '123' }
const subDocumentToUpsert = { name: 'xyz', id: '1' }
collection.update(documentQuery, [
  {
    $set: {
      sub_documents: {
        $cond: {
          if: { $not: ['$sub_documents'] },
          then: [subDocumentToUpsert],
          else: {
            $cond: {
              if: { $in: [subDocumentToUpsert.id, '$sub_documents.id'] },
              then: {
                $map: {
                  input: '$sub_documents',
                  as: 'sub_document',
                  in: {
                    $cond: {
                      if: { $eq: ['$$sub_document.id', subDocumentToUpsert.id] },
                      then: subDocumentToUpsert,
                      else: '$$sub_document',
                    },
                  },
                },
              },
              else: { $concatArrays: ['$sub_documents', [subDocumentToUpsert]] },
            },
          },
        },
      },
    },
  },
])
There's a way to do it in two queries - but it will still work in a bulkWrite.
This is relevant because in my case not being able to batch it is the biggest hangup. With this solution, you don't need to collect the result of the first query, which allows you to do bulk operations if you need to.
Here are the two successive queries to run for your example:
// Update subdocument if existing
collection.updateMany({
  name: 'foo', 'bars.name': 'qux'
}, {
  $set: {
    'bars.$.somefield': 2
  }
})
// Insert subdocument otherwise ($not is not a valid top-level operator, so negate on the field instead)
collection.updateMany({
  name: 'foo', 'bars.name': { $ne: 'qux' }
}, {
  $push: {
    bars: {
      somefield: 2, name: 'qux'
    }
  }
})
This also has the added benefit of not having corrupted data / race conditions if multiple applications are writing to the database concurrently. You won't risk ending up with two bars: {somefield: 2, name: 'qux'} subdocuments in your document if two applications run the same queries at the same time.
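Since both operations are plain update specifications, they can also be sent together in a single bulkWrite (a sketch reusing the filters and updates above):
collection.bulkWrite([
  // Update the existing subdocument if it is there...
  { updateMany: {
      filter: { name: 'foo', 'bars.name': 'qux' },
      update: { $set: { 'bars.$.somefield': 2 } }
  }},
  // ...otherwise push a new one.
  { updateMany: {
      filter: { name: 'foo', 'bars.name': { $ne: 'qux' } },
      update: { $push: { bars: { name: 'qux', somefield: 2 } } }
  }}
], { ordered: true }) // ordered (the default) keeps the two steps in sequence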

How to make a query using Mongoose that gets N results, but combines any documents it finds that meet certain criteria?

I have a Comments collection in Mongoose, and a query that returns the most recent five (an arbitrary number) Comments.
Every Comment is associated with another document. What I would like to do is make a query that returns the most recent 5 comments, with comments associated with the same other document combined.
So instead of a list like this:
results = [
{ _id: 123, associated: 12 },
{ _id: 122, associated: 8 },
{ _id: 121, associated: 12 },
{ _id: 120, associated: 12 },
{ _id: 119, associated: 17 }
]
I'd like to return a list like this:
results = [
{ _id: 124, associated: 3 },
{ _id: 125, associated: 19 },
[
{ _id: 123, associated: 12 },
{ _id: 121, associated: 12 },
{ _id: 120, associated: 12 },
],
{ _id: 122, associated: 8 },
{ _id: 119, associated: 17 }
]
Please don't worry too much about the data format: it's just a sketch to try to show the sort of thing I want. I want a result set of a specified size, but with some results grouped according to some criterion.
Obviously one way to do this would be to just make the query, crawl and modify the results, then recursively make the query again until the result set is as long as desired. That way seems awkward. Is there a better way to go about this? I'm having trouble phrasing it in a Google search in a way that gets me anywhere near anyone who might have insight.
Here's an aggregation pipeline query that will do what you are asking for:
db.comments.aggregate([
{ $group: { _id: "$associated", maxID: { $max: "$_id"}, cohorts: { $push: "$$ROOT"}}},
{ $sort: { "maxID": -1 } },
{ $limit: 5 }
])
Lacking any other fields from the sample data to sort by, I used $_id.
If you'd like results that are a little closer in structure to the sample result set you provided you could add a $project to the end:
db.comments.aggregate([
{ $group: { _id: "$associated", maxID: { $max: "$_id"}, cohorts: { $push: "$$ROOT"}}},
{ $sort: { "maxID": -1 } },
{ $limit: 5 },
{ $project: { _id: 0, cohorts: 1 }}
])
That will print only the result set. Note that even comments that do not share an association will be in an array; it will just be an array of length 1.
If you are concerned about limiting the results in the grouping as Neil Lunn is suggesting, perhaps a $match in the beginning is a smart idea.
db.comments.aggregate([
{ $match: { createDate: { $gte: new Date(new Date() - 5 * 60000) } } },
{ $group: { _id: "$associated", maxID: { $max: "$_id"}, cohorts: { $push: "$$ROOT"}}},
{ $sort: { "maxID": -1 } },
{ $limit: 5 },
{ $project: { _id: 0, cohorts: 1 }}
])
That will only include comments made in the last 5 minutes assuming you have a createDate type field. If you do, you might also consider using that as the field to sort by instead of "_id". If you do not have a createDate type field, I'm not sure how best to limit the comments that are grouped as I do not know of a "current _id" in the way that there is a "current time".
I honestly think you are asking a lot here and cannot really see the utility myself, but I'm always happy to have that explained to me if there is something useful I have missed.
Bottom line is you want comments from the last five distinct users by date, and then some sort of grouping of additional comments by those users. The last part is where I see difficulty in rules no matter how you want to attack this, but I'll try to keep this to the most brief form.
No way this happens in a single query of any sort. But there are things that can be done to make it an efficient server response:
var async = require('async'),   // the 'async' utility library used below
    DataStore = require('nedb'),
    store = new DataStore();

async.waterfall(
  [
    function(callback) {
      Comment.aggregate(
        [
          { "$match": { "postId": thisPostId } },
          { "$sort": { "associated": 1, "createdDate": -1 } },
          { "$group": {
            "_id": "$associated",
            "date": { "$first": "$createdDate" }
          }},
          { "$sort": { "date": -1 } },
          { "$limit": 5 }
        ],
        callback);
    },
    function(docs, callback) {
      async.each(docs, function(doc, callback) {
        Comment.aggregate(
          [
            { "$match": { "postId": thisPostId, "associated": doc._id } },
            { "$sort": { "createdDate": -1 } },
            { "$limit": 5 },
            { "$group": {
              "_id": "$associated",
              "docs": {
                "$push": {
                  "_id": "$_id", "createdDate": "$createdDate"
                }
              },
              "firstDate": { "$first": "$createdDate" }
            }}
          ],
          function(err, results) {
            if (err) callback(err);
            async.each(results, function(result, callback) {
              store.insert(result, function(err, result) {
                callback(err);
              });
            }, function(err) {
              callback(err);
            });
          }
        );
      }, callback);
    }
  ],
  function(err) {
    if (err) throw err;
    store.find({}).sort({ "firstDate": -1 }).exec(function(err, docs) {
      if (err) throw err;
      console.log(JSON.stringify(docs, undefined, 4));
    });
  }
);
Now I stuck more document properties in both the document and the array, but the simplified form based on your sample would then come out like this:
results = [
{ "_id": 3, "docs": [124] },
{ "_id": 19, "docs": [125] },
{ "_id": 12, "docs": [123,121,120] },
{ "_id": 8, "docs": [122] },
{ "_id": 17, "docs": [119] }
]
So the essential idea is to first find your distinct "users" who were the last to comment, basically by chopping off the last 5. Without filtering on some kind of range here, that would go over the entire collection to get those results, so it would be best to restrict this in some way, such as the last hour or last few hours or something sensible as required. Just add those conditions to the $match along with the current post that is associated with the comments.
Once you have those 5, then you want to get any possible "grouped" details for multiple comments by those users. Again, some sort of limit is generally advised for a timeframe, but as a general case this is just looking for the most recent comments by the user on the current post and restricting that to 5.
The execution here is done in parallel, which will use more resources but is fairly effective considering there are only 5 queries to run anyway. In contrast to your example output, the array here is inside the document result, and it contains the original document id values for each comment for reference. Any other content related to the document would be pushed into the array as well as required (ie The content of the comment).
The other little trick here is using nedb as a means for storing the output of each query in an "in memory" collection. This need only really be a standard hash data structure, but nedb gives you a way of doing that while maintaining the MongoDB statement form that you may be used to.
Once all results are obtained you just return them as your output, and sorted as shown to retain the order of who commented last. The actual comments are grouped in the array for each item and you can traverse this to output how you like.
Bottom line here is that you are asking for a compounded version of the "top N results problem", which is something often asked of MongoDB. I've written about ways to tackle this before to show how it's possible in a single aggregation pipeline stage, but it really is not practical for anything more than a relatively small result set.
If you really want to join in the insanity, then you can look at Mongodb aggregation $group, restrict length of array for one of the more detailed examples. But for my money, I would run on parallel queries any day. Node.js has the right sort of environment to support them, so you would be crazy to do it otherwise.
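For what it's worth, here is a rough modern sketch of the same two-phase idea using promises instead of async/nedb (the Comment model and fields follow the example above; it is an outline under those assumptions, not a drop-in replacement):
async function latestGroupedComments(thisPostId) {
  // Phase 1: the last 5 distinct commenters on this post.
  const heads = await Comment.aggregate([
    { $match: { postId: thisPostId } },
    { $sort: { associated: 1, createdDate: -1 } },
    { $group: { _id: "$associated", date: { $first: "$createdDate" } } },
    { $sort: { date: -1 } },
    { $limit: 5 }
  ]);

  // Phase 2: up to 5 recent comments for each of them, run in parallel.
  const groups = await Promise.all(heads.map(({ _id }) =>
    Comment.aggregate([
      { $match: { postId: thisPostId, associated: _id } },
      { $sort: { createdDate: -1 } },
      { $limit: 5 },
      { $group: {
          _id: "$associated",
          docs: { $push: { _id: "$_id", createdDate: "$createdDate" } },
          firstDate: { $first: "$createdDate" }
      }}
    ]).then(res => res[0])
  ));

  // Keep the "who commented last" ordering.
  return groups.sort((a, b) => b.firstDate - a.firstDate);
}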

Resources