If I have a collection with documents looking more or less like this:
{
"_id": "52c06c86b78a091f26000001",
"desc": "Something somethingson",
"likes": [
{
"user_id": "52add4f4344e8ca867000001",
"username": "asd",
"created": 1390652212592
},
{
"user_id": "52add4f4344e8ca867000001",
"username": "asd",
"created": 1390652212592
}
],
"user_id": "52add4f4344e8ca867000001",
"username":"username"
}
Could I with mongodb (using nodejs & express) get a list of 10 users with the most liked posts?
I'm thinking it should be possible to do this with Map-Reduce or with Aggregate, by grouping all posts by "user_id" and then counting the total number of items in the "likes" array, and then taking the top 10 "user_id"s from that.
I'm guessing this is possible. The only problem is that I don't understand the examples from the mongo website well enough to put together my own thing.
Could anyone show me an example of doing this, or at least tell me it's impossible if it is. =)
Try that out...
db.collection.aggregate([
{$unwind: "$likes"},
{$group: {_id: "$user_id", likesCount: {$sum: 1}}},
{$sort: {likesCount: -1}},
{$limit: 10},
{$project: {_id: 1}}
])
See this tutorial.
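A possible variation, as a rough sketch: instead of unwinding, sum the length of each "likes" array with $size inside $group. This assumes MongoDB 2.6 or later (where the $size aggregation operator exists); the function name buildTopLikedUsersPipeline is made up for illustration. Unlike the $unwind version, authors whose posts have no likes still show up, with a count of 0.

```javascript
// Hypothetical helper: builds a pipeline that ranks post authors by total likes.
function buildTopLikedUsersPipeline(limit) {
  return [
    {
      $group: {
        _id: "$user_id",
        // Add up the length of each post's likes array.
        // $ifNull guards against documents that have no likes field.
        likesCount: { $sum: { $size: { $ifNull: ["$likes", []] } } }
      }
    },
    { $sort: { likesCount: -1 } },
    { $limit: limit }
  ];
}

// Usage (collection name assumed):
// db.collection.aggregate(buildTopLikedUsersPipeline(10))
```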
Yes, it is possible:
db.coll.aggregate([
{
$unwind: "$likes"
},
{
$group: {
_id: "$user_id",
likesPerUser: {
$sum:1
}
}
},
{
$sort: {
likesPerUser: -1
}
},
{
$limit: 10
}
])
If you need to count the number of "likes" elements per parent user (the author of the post), you can do the following:
http://mongotry.herokuapp.com/#?bookmarkId=52fb8c604e86f9020071ed71
[
{
"$unwind": "$likes"
},
{
"$group": {
"_id": "$user_id",
"likesPerUser": {
"$sum": 1
}
}
},
{
"$sort": {
"likesPerUser": -1
}
},
{
"$limit": 10
}
]
Otherwise, if you need the number of likes given by each liking user_id, counting 1 per sub-document across all parent documents, you can do the following:
http://mongotry.herokuapp.com/#?bookmarkId=52fb8cf74e86f9020071ed72
[
{
"$unwind": "$likes"
},
{
"$group": {
"_id": "$likes.user_id",
"likesPerUser": {
"$sum": 1
}
}
},
{
"$sort": {
"likesPerUser": -1
}
},
{
"$limit": 1
}
]
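To make the difference between the two groupings concrete, here is a small in-memory sketch in plain JavaScript that mimics $unwind, $group, $sort and $limit. The sample data and the countLikes helper are invented for illustration; it is not how MongoDB runs the pipeline, just the same arithmetic.

```javascript
// Made-up sample data shaped like the documents in the question.
const posts = [
  { user_id: "author1", likes: [{ user_id: "fanA" }, { user_id: "fanB" }] },
  { user_id: "author2", likes: [{ user_id: "fanA" }] },
  { user_id: "author1", likes: [{ user_id: "fanA" }] }
];

// keyOf(post, like) plays the role of the _id expression in $group.
function countLikes(docs, keyOf, limit) {
  const counts = new Map();
  for (const post of docs) {
    for (const like of post.likes || []) {           // $unwind: one row per like
      const key = keyOf(post, like);
      counts.set(key, (counts.get(key) || 0) + 1);   // $sum: 1
    }
  }
  return [...counts]
    .map(([_id, likesPerUser]) => ({ _id, likesPerUser }))
    .sort((a, b) => b.likesPerUser - a.likesPerUser) // $sort descending
    .slice(0, limit);                                // $limit
}

// Grouping by "$user_id": likes received per post author.
const byAuthor = countLikes(posts, (post) => post.user_id, 10);
// Grouping by "$likes.user_id": likes given per liking user.
const byLiker = countLikes(posts, (post, like) => like.user_id, 10);

console.log(byAuthor); // author1 has 3 likes, author2 has 1
console.log(byLiker);  // fanA gave 3 likes, fanB gave 1
```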
So what I want to do is group all documents having same hash whose count is more than 1 and only keep the oldest record according to startDate
My db structure is as follows:
[{
"_id": "82bacef1915f4a75e6a18406",
"Hash": "cdb3d507734383260b1d26bd3edcdfac",
"duration": 12,
"price": 999,
"purchaseType": "Complementary",
"startDate": {
"$date": {
"$numberLong": "1656409841000"
}
},
"endDate": {
"$date": {
"$numberLong": "1687859441000"
}
}
}]
I was using this query which I created
db.Mydb.aggregate([
{
"$group": {
_id: {hash: "$Hash"},
dups: { $addToSet: "$_id" } ,
count: { $sum : 1 }
}
},{"$sort":{startDate:-1}},
{
"$match": {
count: { "$gt": 1 }
}
}
]).forEach(function(doc) {
doc.dups.shift();
db.Mydb.deleteMany({
_id: {$in: doc.dups}
});
})
this gives a result like this:
{ _id: { hash: '1c01ef475d072f207c4485d0a6448334' },
dups:
[ '6307501ca03c94389f09b782',
'6307501ca03c94389f09b783',
'62bacef1915f4a75e6a18l06' ],
count: 3 }
The problem with this is that the _ids in the dups array come back in a different order every time I run this query, i.e. they are not sorted according to the startDate field.
What can be done here?
Any help is appreciated. Thanks!
After the $group stage, the startDate field will not be present in the results, so you cannot sort on that field. So, as stated in the comments, you should put the $sort stage first in the aggregation pipeline.
db.Mydb.aggregate([
{
"$sort": { startDate: -1}
},
{
"$group": {
_id: {hash: "$Hash"},
dups: { $addToSet: "$_id" } ,
count: { $sum : 1 }
}
},
{
"$match": { count: { "$gt": 1 } }
}
])
Got the solution. I was using $addToSet in the $group pipeline stage, which builds a set and does not guarantee the order of its elements. Instead, I used $push, which appends each value in the order the documents arrive from the preceding $sort.
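As a rough in-memory sketch of the same idea (sort, group with push, drop the first id of each group), assuming startDate is a Date or a millisecond timestamp; the idsToDelete name is hypothetical. Note that the sort is ascending here, so the first id in each group is the oldest one and is the one kept.

```javascript
// Returns the _ids to delete: for every Hash that occurs more than once,
// all documents except the one with the earliest startDate.
function idsToDelete(docs) {
  // Equivalent of { $sort: { startDate: 1 } } (oldest first).
  const sorted = [...docs].sort((a, b) => a.startDate - b.startDate);

  // Equivalent of $group by Hash with dups: { $push: "$_id" }.
  const groups = new Map();
  for (const doc of sorted) {
    if (!groups.has(doc.Hash)) groups.set(doc.Hash, []);
    groups.get(doc.Hash).push(doc._id);
  }

  // Equivalent of $match count > 1, then skipping the first (oldest) id.
  const toDelete = [];
  for (const ids of groups.values()) {
    if (ids.length > 1) toDelete.push(...ids.slice(1));
  }
  return toDelete;
}
```

The returned array could then be passed to a single deleteMany with _id: { $in: ... } instead of one call per group.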
I'm trying to group values together in mongoose. I have a "Review" schema with the following fields:
{ userId, rating, comment }
There are many documents with the same userId. How can I retrieve them in the following format:
{userId: [...allRatings]}
Or even better, is there a way to retrieve the averages for each userId? so like this: {userId: 2.8}
I know it's possible and very simple to do in node.js, but is there a way of doing it with mongoose?
Mongoose is really just a vehicle to pass commands to your MongoDB server, so accomplishing what you want in mongoose isn't dissimilar to accomplishing it in the mongo shell.
Here is the aggregation you're looking for:
db.collection.aggregate([
{
"$group": {
"_id": "$userId",
"ratings": {
$push: "$rating"
}
}
},
{
"$project": {
"_id": false,
"userId": "$_id",
"avgRating": {
"$avg": "$ratings"
}
}
}
])
The first stage of the pipeline groups all ratings by userId. The second stage calculates the average of those ratings and tidies up the key names. That's it. The result will be this:
[
{
"avgRating": 2.8,
"userId": 110
},
{
"avgRating": 3.275,
"userId": 100
}
]
Here is a playground for you: https://mongoplayground.net/p/yXVxk4klabB
As for how to specifically run this command in mongoose, well that's pretty straightforward:
const YourModel = mongoose.model('your_model');
...
YourModel.aggregate([
{
"$group": {
"_id": "$userId",
"ratings": {
$push: "$rating"
}
}
},
{
"$project": {
"_id": false,
"userId": "$_id",
"avgRating": {
"$avg": "$ratings"
}
}
}
])
.then(result => {
console.log(result);
})
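If only the averages are needed, a shorter variant is to use $avg directly as an accumulator in $group, which skips the intermediate ratings array. The sketch below shows that pipeline plus a plain JavaScript equivalent for checking expected values without a database; the names avgRatingPipeline and averageRatings are made up for illustration.

```javascript
// $avg used as a $group accumulator: no $push and no intermediate array.
const avgRatingPipeline = [
  { $group: { _id: "$userId", avgRating: { $avg: "$rating" } } },
  { $project: { _id: false, userId: "$_id", avgRating: 1 } }
];

// In-memory equivalent of the pipeline above.
function averageRatings(reviews) {
  const totals = new Map();
  for (const { userId, rating } of reviews) {
    const entry = totals.get(userId) || { sum: 0, count: 0 };
    entry.sum += rating;
    entry.count += 1;
    totals.set(userId, entry);
  }
  return [...totals].map(([userId, entry]) => ({
    userId,
    avgRating: entry.sum / entry.count
  }));
}

// Usage with mongoose (model name assumed):
// YourModel.aggregate(avgRatingPipeline).then(result => console.log(result));
```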
I have used SQL Server for a long time and am only now really learning MongoDB. I am trying to figure out how to write aggregate queries that return just the data I want. Here is a sample from the database:
{
"_id": "1",
"user_id": "123-88",
"department_id": "1",
"type": "start",
"time": "2017-04-20T19:40:15.329Z"
}
{
"_id": "2",
"user_id": "123-88",
"department_id": "1",
"type": "stop",
"time": "2017-04-20T19:47:15.329Z"
}
What I want to do is find each unique user_id of department 1, only take the record with the latest time and tell me if they are oncall or not. So in the example above user 123-88 is not oncall now. How would you make this query? I know you will need something like this:
TimeCard.aggregate([
{ $match: {department_id: req.query.department_id}},
{ $sort: { user_id: 1, time: 1 }},
{ $group: { _id: "$user_id", current_type: "$type", lastTime: { $last: "$time" }}}
], function(err, docs){
if(err){console.log(err);}
else {res.json(docs);}
});
But I keep erroring so I know I am not correct in my logic. I know if I just have the match it works and if I add the sort it matches and sorts but the final group is not working. Also what would I add to then only show the people that are oncall. Thanks again for your help.
You can count how many records ("types") you have per user_id, which can be done with $sum. If the count is odd, the user is oncall, because there is a start without a stop. This approach is correct only if start and stop records strictly alternate, i.e. every stop has a matching start.
TimeCard.aggregate([
{ $match: { department_id: req.query.department_id } },
{ $sort: { user_id: 1, time: 1 } },
{ $group: { _id: "$user_id", count_types: { $sum: 1 }, lastTime: { $last: "$time" }}},
{ $match: { count_types: { $mod: [ 2, 1 ] } } },
], function(err, docs) {
if(err) { console.log(err); }
else { res.json(docs); }
});
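Another option, sketched here under the assumption that the field names match the sample documents, is to take each user's latest "type" with $last and keep only those whose latest record is a "start". This does not depend on starts and stops being perfectly paired. The helper name buildOnCallPipeline is hypothetical.

```javascript
// Hypothetical helper: builds the pipeline for one department.
function buildOnCallPipeline(departmentId) {
  return [
    { $match: { department_id: departmentId } },
    // Oldest first, so $last picks each user's most recent record.
    { $sort: { user_id: 1, time: 1 } },
    {
      $group: {
        _id: "$user_id",
        // Every field other than _id in $group needs an accumulator.
        current_type: { $last: "$type" },
        lastTime: { $last: "$time" }
      }
    },
    // Keep only users whose latest record is a "start", i.e. currently oncall.
    { $match: { current_type: "start" } }
  ];
}

// Usage (model and request object as in the question):
// TimeCard.aggregate(buildOnCallPipeline(req.query.department_id), callback);
```

Dropping the final $match stage would return every user in the department along with their current state.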
I have 3 arrays of ObjectIds I want to concatenate into a single array, and then sort by creation date. $setUnion does precisely what I want, but I'd like to try without using it.
Schema of object I want to sort:
var chirpSchema = new mongoose.Schema({
interactions: {
_liked : ["55035390d3e910505be02ce2"] // [{ type: $oid, ref: "interaction" }]
, _shared : ["507f191e810c19729de860ea", "507f191e810c19729de860ea"] // [{ type: $oid, ref: "interaction" }]
, _viewed : ["507f1f77bcf86cd799439011"] // [{ type: $oid, ref: "interaction" }]
}
});
Desired result: Concatenate _liked, _shared, and _viewed into a single array, and then sort them by creation date using aggregate pipeline. See below
["507f1f77bcf86cd799439011", "507f191e810c19729de860ea", "507f191e810c19729de860ea", "55035390d3e910505be02ce2"]
I know I'm supposed to use $push, $each, $group, and $unwind in some combination or other, but I'm having trouble piecing together the documentation to make this happen.
Update: Query
model_user.aggregate([
{ $match : { '_id' : { $in : following } } }
, { $project : { 'interactions' : 1 } }
, { $project : {
"combined": { $setUnion : [
"$interactions._liked"
, "$interactions._shared"
, "$interactions._viewed"
]}
}}
])
.exec(function (err, data) {
if (err) return next(err);
next(data); // Combined is returning null
})
If all the Object _id values are "unique" then $setUnion is your best option. It is of course not "ordered" in any way as it works with a "set", and that does not guarantee order. But you can always unwind and $sort.
[
{ "$project": {
"combined": { "$setUnion": [
{ "$ifNull": [ "$interactions._liked", [] ] },
{ "$ifNull": [ "$interactions._shared", [] ] },
{ "$ifNull": [ "$interactions._viewed", [] ] }
]}
}},
{ "$unwind": "$combined" },
{ "$sort": { "combined": 1 } },
{ "$group": {
"_id": "$_id",
"combined": { "$push": "$combined" }
}}
]
Of course again since this is a "set" of distinct values you can do the old way instead with $addToSet, after processing $unwind on each array:
[
{ "$unwind": "$interactions._liked" },
{ "$unwind": "$interactions._shared" },
{ "$unwind": "$interactions._viewed" },
{ "$project": {
"interactions": 1,
"type": { "$const": [ "liked", "shared", "viewed" ] }
}},
{ "$unwind": "$type" },
{ "$group": {
"_id": "$_id",
"combined": {
"$addToSet": {
"$cond": [
{ "$eq": [ "$type", "liked" ] },
"$interactions._liked",
{ "$cond": [
{ "$eq": [ "$type", "shared" ] },
"$interactions._shared",
"$interactions._viewed"
]}
]
}
}
}},
{ "$unwind": "$combined" },
{ "$sort": { "combined": 1 } },
{ "$group": {
"_id": "$_id",
"combined": { "$push": "$combined" }
}}
]
But still the same thing applies to ordering.
Later releases (MongoDB 3.2 and up, with $concatArrays) even have the ability to concatenate arrays without reducing to a "set":
[
{ "$project": {
"combined": { "$concatArrays": [
"$interactions._liked",
"$interactions._shared",
"$interactions._viewed"
]}
}},
{ "$unwind": "$combined" },
{ "$sort": { "combined": 1 } },
{ "$group": {
"_id": "$_id",
"combined": { "$push": "$combined" }
}}
]
But still there is no way to re-order the results without processing $unwind and $sort.
You might therefore consider that, unless you need this grouped across multiple documents, the basic "concatenate and sort" operation is best handled in client code. MongoDB has no way to do this "in place" on the array at present, so doing it per document in client code is your best bet.
But if you do need to do this grouping over multiple documents, then the sort of approaches as shown here are for you.
Also note that "creation" here means creation of the ObjectId value itself and not other properties from your referenced objects. If you need those, then you perform a populate on the id values after the aggregation or query instead, and of course sort in client code.
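A minimal sketch of that client-side "concatenate and sort", assuming the ids are hex strings or ObjectId values whose string form is the 24-character hex representation; combineAndSort is a made-up name. Sorting the hex strings orders the ids by their embedded creation timestamp, since that timestamp occupies the leading bytes.

```javascript
// Concatenate the three id arrays of one document and sort by ObjectId value.
function combineAndSort(interactions) {
  const lists = [
    interactions._liked,
    interactions._shared,
    interactions._viewed
  ];
  return lists
    .flatMap((ids) => ids || [])   // a missing or null array counts as empty
    .map(String)                   // ObjectId -> hex string
    .sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));
}

const combined = combineAndSort({
  _liked: ["55035390d3e910505be02ce2"],
  _shared: ["507f191e810c19729de860ea", "507f191e810c19729de860ea"],
  _viewed: ["507f1f77bcf86cd799439011"]
});
console.log(combined);
```

Duplicates are kept, as with $concatArrays; wrapping the result in a Set would give the $setUnion behaviour instead.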
Is there a non-strict version of $nin in MongoDB? For example:
Let's say that we have a model called User and a model called Task
var TaskSchema = new Schema({
user_array: [{user: Schema.ObjectId}],
});
A quick sample would be this
task1 : [user1, user2, user4, user7]
task2 : [user2, user5, user7]
If I have a list of users
[user1, user7]
I want to select the task whose user_array overlaps least with that list, in this case task2. I know $nin strictly returns only tasks that contain neither user1 nor user7, but I would like to know if there is an operation where $nin is non-strict.
Alternatively, I could write a DP function to do this for me.
Any advice would be appreciated
Thanks
Well in MongoDB version 2.6 and upwards you have the $setIntersection and $size operators available so you can perform an .aggregate() statement like this:
db.collection.aggregate([
{ "$project": {
"user_array": 1,
"size": { "$size": {
"$setIntersection": [ "$user_array", [ "user1", "user7" ] ]
}}
}},
{ "$match": { "size": { "$gt": 0 } } },
{ "$sort": { "size": 1 }},
{ "$group": {
"_id": null,
"user_array": { "$first": "$user_array" }
}}
])
So those operators help to reduce the steps required to find the least matching document.
Basically, $setIntersection returns the elements of the array that also appear in the list it is being compared with. The $size operator returns the "size" of that resulting array. So later you filter out with $match any documents where none of the items in the matching list were found in the array.
Finally you just sort and return the item with the "least" matches.
But it can still be done in earlier versions with some more steps. So basically your "non-strict" implementation becomes an $or condition. But of course you still need to count the matches:
db.collection.aggregate([
{ "$project": {
"_id": {
"_id": "$_id",
"user_array": "$user_array"
},
"user_array": 1
}},
{ "$unwind": "$user_array" },
{ "$match": {
"$or": [
{ "user_array": "user1" },
{ "user_array": "user7" }
]
}},
{ "$group": {
"_id": "$_id",
"size": { "$sum": 1 }
}},
{ "$sort": { "size": 1 } },
{ "$group": {
"_id": null,
"user_array": { "$first": "$_id.user_array" }
}}
])
And that would do the same thing.
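For comparison, the same selection can be sketched in plain JavaScript on already-loaded documents. This assumes user_array holds plain, unique values as in the examples above (not the {user: ObjectId} sub-documents from the schema); leastOverlappingTask is a hypothetical name.

```javascript
// Returns the task that shares the fewest (but at least one) users with the list,
// or null when no task overlaps at all.
function leastOverlappingTask(tasks, users) {
  const wanted = new Set(users);
  const scored = tasks
    .map((task) => ({
      task,
      // Size of the intersection, like $size of $setIntersection.
      size: task.user_array.filter((user) => wanted.has(user)).length
    }))
    .filter((entry) => entry.size > 0)  // drop tasks with no overlap
    .sort((a, b) => a.size - b.size);   // fewest matches first
  return scored.length > 0 ? scored[0].task : null;
}

const tasks = [
  { name: "task1", user_array: ["user1", "user2", "user4", "user7"] },
  { name: "task2", user_array: ["user2", "user5", "user7"] }
];
console.log(leastOverlappingTask(tasks, ["user1", "user7"]).name); // task2
```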