mongoose: sort and paginating the field inside $project - node.js

$project: {
  _id: 1,
  edited: 1,
  game: {
    gta: {
      totalUserNumber: {
        $reduce: {
          input: "$gta.users",
          initialValue: 0,
          in: { $add: [{ $size: "$$this" }, "$$value"] },
        },
      },
      userList: "$gta.users", // <----- paginating this
    },
    DOTA2: {
      totalUserNumber: {
        $reduce: {
          input: "$dota2.users",
          initialValue: 0,
          in: { $add: [{ $size: "$$this" }, "$$value"] },
        },
      },
      userList: "$dota2.users", // <----- paginating this
    },
  },
  // .... More Games
},
I have this $project. I have paginated the list of games by using $facet, $sort, $skip and $limit after the $project.
I am also trying to paginate each game's userList. I have already managed to get the total count, in order to calculate the number of pages and so on.
But I am struggling to apply $sort and $limit inside the $project. So far, I have just returned the whole document and then paginated the returned value. However, I don't think this is very efficient, and I am wondering whether there is any way to paginate the field inside the $project.
Is there any way to apply $sort and $limit inside the $project, so that pagination is applied to those fields before they are returned?
------ Edit ------
This is about paginating the field. Since I am already paginating the documents (the game list), I could not find any way to paginate the field as well, because I could not find any way to apply $facet to a field.
e.g. document
[
  gta: {
    userID: ['aa', 'bb', 'cc' ......],
  },
  dota: {
    userID: ['aa', 'bb', 'cc' ......],
  }
  ....
]
I am using $facet to paginate the list of games (dota, gta, lol and more). However, I did not want to return all the userIDs; I had to return the entire document and then paginate the userID array to replace it in the JSON doc.
Now I can paginate the field inside the aggregation pipeline by using $function.
Thanks to Mongodb sort inner array!
const _function = function (e) {
  // `e` is the userID array; do whatever you want with it here.
  return {
  };
};
game
  .collection("game")
  .aggregate([
    // ...your other stages ($match, $facet pagination, etc.)...
    {
      $set: {
        "game": {
          $function: {
            body: _function,
            args: ["$userID"],
            lang: "js",
          },
        },
      },
    },
  ])
  .toArray();
By using $function multiple times, you will be able to paginate the field. I don't really know whether this is faster or not, though. Also, make sure you can actually use $function: I read that you can't use it if you are on the free tier at Atlas.
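For instance, here is a minimal sketch of a $function stage that sorts and slices one page out of the array. `page` and `pageSize` are hypothetical request parameters, not part of the original code:

const page = 2, pageSize = 10; // hypothetical pagination parameters

game
  .collection("game")
  .aggregate([
    {
      $set: {
        userList: {
          $function: {
            // Sort the array server-side, then slice out a single page.
            body: function (users, skip, limit) {
              users.sort();
              return users.slice(skip, skip + limit);
            },
            args: ["$userID", (page - 1) * pageSize, pageSize],
            lang: "js",
          },
        },
      },
    },
  ])
  .toArray();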

What you are looking for is the $slice operator.
In this form it takes three parameters:
"$slice": [ <array>, <position>, <n> ]
userList: { "$slice": ["$dota2.users", 20, 10] } // <-- skips the first 20 elements of the array and returns the next 10
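As a usage sketch against the question's schema (`page`, 1-based, and `pageSize` are hypothetical pagination parameters, not part of the question):

const page = 3, pageSize = 10; // hypothetical pagination parameters

const projectStage = {
  $project: {
    _id: 1,
    // Skip the first (page - 1) * pageSize users, then take one page.
    "game.DOTA2.userList": {
      $slice: ["$dota2.users", (page - 1) * pageSize, pageSize],
    },
  },
};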

Related

How to return field based on other fields with mongoose

I have a Mongoose schema that looks something like this:
{
  _id: someId,
  name: 'mike',
  keys: {
    apiKey: 'fsddsfdsfdsffds',
    secretKey: 'sddfsfdsfdsfdsds'
  }
}
Of course I don't want to send the keys back to the front end, but I do want some indication, like:
{
  _id: someId,
  name: 'mike',
  hasKeys: true
}
Is there a built-in way to create a field on the fly based on other fields, or do I need to fetch the whole document every time, check whether keys is non-empty, and set the object property based on that?
For Mongo version 4.2+, what you're looking for is called pipelined updates; it lets you use a (restricted) aggregation pipeline as your update, allowing the usage of existing field values.
Here is a toy example with your data:
db.collection.updateOne(
  { _id: someId },
  [
    {
      "$set": {
        "hasKeys": {
          $cond: [
            { $ifNull: ["$keys", false] },
            true,
            false
          ]
        }
      }
    },
  ])
Mongo Playground
For older Mongo versions you have to do it in code.
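A minimal sketch of that code-side fallback, assuming the native driver inside an async function; the `users` collection name is a placeholder:

// Fetch the document, derive hasKeys in the application,
// then persist it with a plain (non-pipeline) update.
const doc = await db.collection("users").findOne({ _id: someId });
await db.collection("users").updateOne(
  { _id: someId },
  { $set: { hasKeys: doc.keys != null } }
);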
If you don't want to update the actual document but just want to populate this field when you fetch it, you can use the same aggregation to fetch the document.
You can use $project in a Mongoose aggregation like this:
$project: { hasKeys: { $cond: [{ $eq: ['$keys', null] }, false, true]}}
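Put together, a fetch-time sketch might look like this (the `users` collection name is a placeholder):

db.users.aggregate([
  { $match: { _id: someId } },
  {
    $project: {
      name: 1,
      // true when `keys` exists and is non-null, false otherwise
      hasKeys: { $cond: [{ $ifNull: ["$keys", false] }, true, false] },
    },
  },
]);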

MongoDB asymmetrical return of data, first item in array returned in full, the rest with certain properties omitted?

I'm new to MongoDB and getting to grips with its syntax and capabilities. To achieve the functionality described in the title, I believe I can create a promise that will run two simultaneous queries on the document: one to get the full content of one item in the array (or at least the data that is omitted in the other query, to re-add afterwards), searched for by most recent date; the other to return the array minus specific properties. I have the following document:
{
  _id: ObjectId('5rtgwr6gsrtbsr6hsfbsr6bdrfyb'),
  uuid: 'something',
  mainArray: [
    {
      id: 1,
      title: 'A',
      date: 05/06/2020,
      array: ['lots', 'off', 'stuff']
    },
    {
      id: 2,
      title: 'B',
      date: 28/05/2020,
      array: ['even', 'more', 'stuff']
    },
    {
      id: 3,
      title: 'C',
      date: 27/05/2020,
      array: ['mountains', 'of', 'knowledge']
    }
  ]
}
and I would like to return
{
  uuid: 'something',
  mainArray: [
    {
      id: 1,
      title: 'A',
      date: 05/06/2020,
      array: ['lots', 'off', 'stuff']
    },
    {
      id: 2,
      title: 'B'
    },
    {
      id: 3,
      title: 'C'
    }
  ]
}
How valid and performant is the promise approach versus constructing one query that would achieve this? I have no idea how to perform such 'combined-rule' conditions in MongoDB; could anyone give an example?
If the subdocument array you want to omit is not very large, I would just remove it on the application side. Doing processing in MongoDB means you choose to use the compute resources of MongoDB instead of your application's. Generally your application is easier and cheaper to scale, so implementation at the application layer is preferable.
But in this exact case it's not too complex to implement it in MongoDB:
db.collection.aggregate([
  {
    $addFields: { // keep the first element somewhere
      first: { $arrayElemAt: ["$mainArray", 0] }
    }
  },
  {
    $project: { // remove the subdocument field
      "mainArray.array": false
    }
  },
  {
    $addFields: { // join the first element with the rest of the transformed array
      mainArray: {
        $concatArrays: [
          [ "$first" ], // first element
          // select elements from the transformed array except the first
          { $slice: ["$mainArray", 1, { $size: "$mainArray" }] }
        ]
      }
    }
  },
  {
    $project: { // remove the temporary first element
      "first": false
    }
  }
])
MongoDB Playground

mongodb remove document if array count zero after $pull in a single query

I have a requirement where my comments schema looks like the following
{
  "_id": 1,
  "comments": [
    { "userId": "123", "comment": "nice" },
    { "userId": "124", "comment": "super" }
  ]
}
I would like to pull elements based on the userId field.
I am doing the following query:
comments.update({}, { $pull: { comments: { userId: "123" } } })
My requirement is that if the array length becomes zero after the $pull operation, I need to remove the entire document. Is there a way to do this in a single query?
PS: I am using the MongoDB driver, not Mongoose.
If I'm reading your question right, after the $pull, if the comments array is empty (zero length), then remove the document ({ _id: '', comments: [] }).
This should remove all documents where the comments array exists and is empty:
comments.remove({ comments: { $exists: true, $size: 0 } })
I had a similar requirement and used this (using mongoose though):
await Attributes.update({}, { $pull: { values: { id: { $in: valueIds } } } }, { multi: true })
await Attributes.remove({ values: { $exists: true, $size: 0 } })
Not sure if it's possible to do this in one operation or not.
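Since the question uses the native driver rather than Mongoose, a minimal sketch of the same two-step approach (assuming `comments` is a collection handle and an async context):

await comments.updateMany({}, { $pull: { comments: { userId: "123" } } });
// Then delete any documents whose comments array is now empty.
await comments.deleteMany({ comments: { $exists: true, $size: 0 } });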
You can use middleware for this:
http://mongoosejs.com/docs/middleware.html
Write a pre/post update hook in Mongoose to check your condition.

how to combine array of object result in mongodb

How can I combine the matched documents' subdocuments together and return them as one array of objects? I have tried $group but it doesn't seem to work.
My query (this returns an array of objects; in this case there are two):
User.find({
  'business_details.business_location': {
    $near: coords,
    $maxDistance: maxDistance
  },
  'deal_details.deals_expired_date': {
    $gte: new Date()
  }
}, {
  'deal_details': 1
}).limit(limit).exec(function(err, locations) {
  if (err) {
    return res.status(500).json(err)
  }
  console.log(locations)
})
the console.log(locations) result
// gives me the result below
[{
  _id: 55c0b8c62fd875a93c8ff7ea, // first document
  deal_details: [{
    deals_location: '101.6833,3.1333',
    deals_price: 12.12 // 1st deal
  }, {
    deals_location: '101.6833,3.1333',
    deals_price: 34.3 // 2nd deal
  }],
  business_details: {}
}, {
  _id: 55a79898e0268bc40e62cd3a, // second document
  deal_details: [{
    deals_location: '101.6833,3.1333',
    deals_price: 12.12 // 3rd deal
  }, {
    deals_location: '101.6833,3.1333',
    deals_price: 34.78 // 4th deal
  }, {
    deals_location: '101.6833,3.1333',
    deals_price: 34.32 // 5th deal
  }],
  business_details: {}
}]
What I wanted to do is combine both deal_details fields together and return them as one array of objects, containing all 5 deals in a single array instead of two separate arrays.
I have tried to do it in my backend (Node.js) using concat or push; however, when there are more than 2 matching documents I have trouble concatenating them together. Is there any way to combine all matching documents and return them as one, like I mentioned above?
What you are probably missing here is the $unwind pipeline stage, which is what you typically use to "de-normalize" array content, particularly when your grouping operation intends to work across documents in your query result:
User.aggregate(
  [
    // Your basic query conditions
    { "$match": {
      "business_details.business_location": {
        "$near": coords,
        "$maxDistance": maxDistance
      },
      "deal_details.deals_expired_date": {
        "$gte": new Date()
      }
    }},
    // Limit query results here
    { "$limit": limit },
    // Unwind the array
    { "$unwind": "$deal_details" },
    // Group on the common location
    { "$group": {
      "_id": "$deal_details.deals_location",
      "prices": {
        "$push": "$deal_details.deals_price"
      }
    }}
  ],
  function(err, results) {
    if (err) throw err;
    console.log(JSON.stringify(results, undefined, 2));
  }
);
Which gives output like:
{
  "_id": "101.6833,3.1333",
  "prices": [
    12.12,
    34.3,
    12.12,
    34.78,
    34.32
  ]
}
Depending on how many documents actually match the grouping.
Alternately, you might want to look at the $geoNear pipeline stage, which gives a bit more control, especially when dealing with content in arrays.
Also beware that with "location" data in an array, only the "nearest" result is being considered here and not "all" of the array content. So other items in the array may not be actually "near" the queried point. That is more of a design consideration though as any query operation you do will need to consider this.
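A minimal sketch of that $geoNear variant, not a drop-in replacement: $geoNear must be the first pipeline stage, and this assumes a single geospatial index on the location field:

User.aggregate([
  { "$geoNear": {
    "near": coords,
    "distanceField": "distance",
    "maxDistance": maxDistance,
    // Move the non-geo conditions into the stage's query option
    "query": {
      "deal_details.deals_expired_date": { "$gte": new Date() }
    }
  }},
  { "$limit": limit },
  { "$unwind": "$deal_details" },
  { "$group": {
    "_id": "$deal_details.deals_location",
    "prices": { "$push": "$deal_details.deals_price" }
  }}
], function(err, results) {
  if (err) throw err;
  console.log(JSON.stringify(results, undefined, 2));
});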
You can merge them with reduce:
locations = locations.reduce(function(prev, location) {
  return prev.concat(location.deal_details);
}, []);

How to make a query using Mongoose that gets N results, but combines any documents it finds that meet certain criteria?

I have a Comments collection in Mongoose, and a query that returns the most recent five (an arbitrary number) Comments.
Every Comment is associated with another document. What I would like to do is make a query that returns the most recent 5 comments, with comments associated with the same other document combined.
So instead of a list like this:
results = [
  { _id: 123, associated: 12 },
  { _id: 122, associated: 8 },
  { _id: 121, associated: 12 },
  { _id: 120, associated: 12 },
  { _id: 119, associated: 17 }
]
I'd like to return a list like this:
results = [
  { _id: 124, associated: 3 },
  { _id: 125, associated: 19 },
  [
    { _id: 123, associated: 12 },
    { _id: 121, associated: 12 },
    { _id: 120, associated: 12 },
  ],
  { _id: 122, associated: 8 },
  { _id: 119, associated: 17 }
]
Please don't worry too much about the data format: it's just a sketch to try to show the sort of thing I want. I want a result set of a specified size, but with some results grouped according to some criterion.
Obviously one way to do this would be to just make the query, crawl and modify the results, then recursively make the query again until the result set is as long as desired. That way seems awkward. Is there a better way to go about this? I'm having trouble phrasing it in a Google search in a way that gets me anywhere near anyone who might have insight.
Here's an aggregation pipeline query that will do what you are asking for:
db.comments.aggregate([
  { $group: { _id: "$associated", maxID: { $max: "$_id" }, cohorts: { $push: "$$ROOT" } } },
  { $sort: { "maxID": -1 } },
  { $limit: 5 }
])
Lacking any other fields from the sample data to sort by, I used $_id.
If you'd like results that are a little closer in structure to the sample result set you provided you could add a $project to the end:
db.comments.aggregate([
  { $group: { _id: "$associated", maxID: { $max: "$_id" }, cohorts: { $push: "$$ROOT" } } },
  { $sort: { "maxID": -1 } },
  { $limit: 5 },
  { $project: { _id: 0, cohorts: 1 } }
])
That will print only the result set. Note that even comments that do not share an association object will still be wrapped in an array; it will just be an array of length 1.
If you are concerned about limiting the results in the grouping as Neil Lunn is suggesting, perhaps a $match in the beginning is a smart idea.
db.comments.aggregate([
  { $match: { createDate: { $gte: new Date(new Date() - 5 * 60000) } } },
  { $group: { _id: "$associated", maxID: { $max: "$_id" }, cohorts: { $push: "$$ROOT" } } },
  { $sort: { "maxID": -1 } },
  { $limit: 5 },
  { $project: { _id: 0, cohorts: 1 } }
])
That will only include comments made in the last 5 minutes assuming you have a createDate type field. If you do, you might also consider using that as the field to sort by instead of "_id". If you do not have a createDate type field, I'm not sure how best to limit the comments that are grouped as I do not know of a "current _id" in the way that there is a "current time".
I honestly think you are asking a lot here and cannot really see the utility myself, but I'm always happy to have that explained to me if there is something useful I have missed.
Bottom line is you want comments from the last five distinct users by date, and then some sort of grouping of additional comments by those users. The last part is where I see difficulty in defining the rules no matter how you want to attack this, but I'll try to keep this as brief as possible.
No way this happens in a single query of any sort. But there are things that can be done to make it an efficient server response:
var async = require('async'),
    DataStore = require('nedb'),
    store = new DataStore();

async.waterfall(
  [
    function(callback) {
      Comment.aggregate(
        [
          { "$match": { "postId": thisPostId } },
          { "$sort": { "associated": 1, "createdDate": -1 } },
          { "$group": {
            "_id": "$associated",
            "date": { "$first": "$createdDate" }
          }},
          { "$sort": { "date": -1 } },
          { "$limit": 5 }
        ],
        callback);
    },
    function(docs, callback) {
      async.each(docs, function(doc, callback) {
        Comment.aggregate(
          [
            { "$match": { "postId": thisPostId, "associated": doc._id } },
            { "$sort": { "createdDate": -1 } },
            { "$limit": 5 },
            { "$group": {
              "_id": "$associated",
              "docs": {
                "$push": {
                  "_id": "$_id", "createdDate": "$createdDate"
                }
              },
              "firstDate": { "$first": "$createdDate" }
            }}
          ],
          function(err, results) {
            if (err) return callback(err);
            async.each(results, function(result, callback) {
              store.insert(result, function(err, result) {
                callback(err);
              });
            }, function(err) {
              callback(err);
            });
          }
        );
      },
      callback);
    }
  ],
  function(err) {
    if (err) throw err;
    store.find({}).sort({ "firstDate": -1 }).exec(function(err, docs) {
      if (err) throw err;
      console.log(JSON.stringify(docs, undefined, 4));
    });
  }
);
Now I stuck more document properties in both the document and the array, but the simplified form based on your sample would then come out like this:
results = [
  { "_id": 3, "docs": [124] },
  { "_id": 19, "docs": [125] },
  { "_id": 12, "docs": [123, 121, 120] },
  { "_id": 8, "docs": [122] },
  { "_id": 17, "docs": [119] }
]
So the essential idea is to first find your distinct "users" who were the last to comment, basically chopping off the last 5. Without filtering on some kind of range, that would go over the entire collection to get those results, so it would be best to restrict this in some way, such as the last hour or last few hours or something sensible as required. Just add those conditions to the $match, along with the current post that the comments are associated with.
Once you have those 5, then you want to get any possible "grouped" details for multiple comments by those users. Again, some sort of limit is generally advised for a timeframe, but as a general case this is just looking for the most recent comments by the user on the current post and restricting that to 5.
The execution here is done in parallel, which will use more resources but is fairly effective considering there are only 5 queries to run anyway. In contrast to your example output, the array here is inside the document result, and it contains the original document id values for each comment for reference. Any other content related to the comment would be pushed into the array as well, as required (i.e. the content of the comment).
The other little trick here is using nedb as a means for storing the output of each query in an "in memory" collection. This need only really be a standard hash data structure, but nedb gives you a way of doing that while maintaining the MongoDB statement form that you may be used to.
Once all results are obtained you just return them as your output, and sorted as shown to retain the order of who commented last. The actual comments are grouped in the array for each item and you can traverse this to output how you like.
Bottom line here is that you are asking for a compounded version of the "top N results problem", which is something often asked of MongoDB. I've written about ways to tackle this before to show how it's possible in a single aggregation pipeline stage, but it really is not practical for anything more than a relatively small result set.
If you really want to join in the insanity, then you can look at Mongodb aggregation $group, restrict length of array for one of the more detailed examples. But for my money, I would run parallel queries any day. Node.js has the right sort of environment to support them, so you would be crazy to do it otherwise.
