Is there a non-strict $nin version in MongoDB? For example, let's say that we have a model called User and a model called Task:
var TaskSchema = new Schema({
user_array: [{user: Schema.ObjectId}],
});
A quick sample would be this
task1 : [user1, user2, user4, user7]
task2 : [user2, user5, user7]
If I have a list of users
[user1, user7]
I want to select the task that has the least overlap with user_array; in this case task2. I know $nin strictly returns the tasks that contain neither user1 nor user7, but I would like to know if there is a non-strict variant of $nin.
Alternatively, I could write a function to do this for me.
Any advice would be appreciated
Thanks
Well, in MongoDB version 2.6 and upwards you have the $setIntersection and $size operators available, so you can perform an .aggregate() statement like this:
db.collection.aggregate([
{ "$project": {
"user_array": 1,
"size": { "$size": {
"$setIntersection": [ "$user_array", [ "user1", "user7" ] ]
}}
}},
{ "$match": { "size": { "$gt": 1 } },
{ "$sort": { "size": 1 }},
{ "$group": {
"_id": null
"user_array": { "$first": "$user_array" }
}}
])
So those operators help to reduce the steps required to find the least matching document.
Basically, $setIntersection returns the elements that the stored array has in common with the array it is being compared against, and the $size operator returns the "size" of that resulting array. You then filter out with $match any documents where none of the items in the matching list were found in the array.
Finally you just sort ascending and return the item with the "least" matches.
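For the sample data in the question, the $project stage would emit documents roughly like this (a sketch, assuming user_array holds the plain values shown in the example and using the task names as stand-ins for the real _id values):
// Sketch of the intermediate documents after $project:
{ "_id": "task1", "user_array": [ "user1", "user2", "user4", "user7" ], "size": 2 }
{ "_id": "task2", "user_array": [ "user2", "user5", "user7" ], "size": 1 }
// $match keeps anything with at least one overlap, $sort puts task2 (size 1) first,
// and $first then picks its user_array as the least-overlapping result.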
But it can still be done in earlier versions with some more steps. So basically your "non-strict" implementation becomes an $or condition. But of course you still need to count the matches:
db.collection.aggregate([
{ "$project": {
"_id": {
"_id": "$_id",
"user_array": "$user_array"
},
"user_array": 1
}},
{ "$unwind": "$user_array" },
{ "$match": {
"$or": [
{ "user_array": "user1" },
{ "user_array": "user7" }
]
}},
{ "$group": {
"_id": "$_id",
"size": { "$sum": 1 }
}},
{ "$sort": { "size": 1 } },
{ "$group": {
"_id": null,
"user_array": { "$first": "$_id.user_array" }
}}
])
And that would do the same thing.
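For reference, a minimal sketch of running that 2.6+ pipeline through Mongoose (assuming the TaskSchema above is compiled as a Task model and that users is the array you want to compare against; $limit is used here instead of the $group/$first trick so the whole winning document comes back):
var mongoose = require('mongoose');
var Task = mongoose.model('Task', TaskSchema); // assumed model compilation
var users = [ "user1", "user7" ];              // hypothetical input list
Task.aggregate([
  { "$project": {
    "user_array": 1,
    "size": { "$size": { "$setIntersection": [ "$user_array", users ] } }
  }},
  { "$match": { "size": { "$gte": 1 } } },
  { "$sort": { "size": 1 } },
  { "$limit": 1 }
]).exec(function (err, result) {
  if (err) return console.error(err);
  console.log(result); // the task with the fewest overlapping users
});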
Related
So what I want to do is group all documents having the same hash where the count is more than 1, and only keep the oldest record according to startDate.
My db structure is as follows:
[{
"_id": "82bacef1915f4a75e6a18406",
"Hash": "cdb3d507734383260b1d26bd3edcdfac",
"duration": 12,
"price": 999,
"purchaseType": "Complementary",
"startDate": {
"$date": {
"$numberLong": "1656409841000"
}
},
"endDate": {
"$date": {
"$numberLong": "1687859441000"
}
}
}]
I was using this query, which I created:
db.Mydb.aggregate([
{
"$group": {
_id: {hash: "$Hash"},
dups: { $addToSet: "$_id" } ,
count: { $sum : 1 }
}
},{"$sort":{startDate:-1}},
{
"$match": {
count: { "$gt": 1 }
}
}
]).forEach(function(doc) {
doc.dups.shift();
db.Mydb.deleteMany({
_id: {$in: doc.dups}
});
})
this gives a result like this:
{ _id: { hash: '1c01ef475d072f207c4485d0a6448334' },
dups:
[ '6307501ca03c94389f09b782',
'6307501ca03c94389f09b783',
'62bacef1915f4a75e6a18l06' ],
count: 3 }
The problem with this is that the _ids in the dups array come back in a random order every time I run this query, i.e. they are not sorted according to the startDate field.
What can be done here?
Any help is appreciated. Thanks!
After the $group stage, the startDate field will not be present in the results, so you cannot sort on it there. As stated in the comments, you should put the $sort stage first in the aggregation pipeline.
db.Mydb.aggregate([
  {
    "$sort": { startDate: -1 }
  },
  {
    "$group": {
      _id: { hash: "$Hash" },
      dups: { $addToSet: "$_id" },
      count: { $sum: 1 }
    }
  },
  {
    "$match": { count: { "$gt": 1 } }
  }
])
Got the solution. I was using $addToSet in the $group stage, which does not guarantee any particular order of the collected values. Using $push instead preserves the order in which documents reach the stage (i.e. the sorted order), so the dups array comes out sorted by startDate.
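Putting the two points together, a sketch of the working pipeline ($sort first, $push instead of $addToSet) would look like this; note that the sort direction decides which duplicate survives the shift(), so ascending keeps the oldest:
db.Mydb.aggregate([
  { "$sort": { startDate: 1 } },            // ascending: oldest document first
  { "$group": {
      _id: { hash: "$Hash" },
      dups: { $push: "$_id" },              // $push preserves the sorted order
      count: { $sum: 1 }
  }},
  { "$match": { count: { "$gt": 1 } } }
]).forEach(function(doc) {
  doc.dups.shift();                         // keep the first _id (the oldest record)
  db.Mydb.deleteMany({ _id: { $in: doc.dups } });
});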
I have a mongoose document having the following Schema:
Products
{
"section":"",
"category":"Food & Drink",
"sub_category":"Main Dish",
"product_code":"ST",
"title":"Steak",
"description":"Served with sauted vegetables",
"tags":[
],
"warranty":"None",
"product_variants":[
{
"variant_code":"ST1",
"variant_title":"Rib Eye",
"images":[
],
"status":"Active",
"variant_details":[
{
"size":"6oz",
"local_price":800,
"local_discount":"0",
"foreign_price":0,
"foreign_discount":"0",
"inventory":[
{
"branch_id":{
},
"quantity":94
}
]
},
{
"size":"10oz",
"local_price":1000,
"local_discount":"0",
"foreign_price":0,
"foreign_discount":"0",
"inventory":[
{
"branch_id":{
},
"quantity":147
}
]
},
{
"size":"12oz",
"local_price":1200,
"local_discount":"0",
"foreign_price":0,
"foreign_discount":"0",
"inventory":[
{
"branch_id":{
},
"quantity":199
}
]
}
]
}
]
}
The above document shows only one object in the product_variants field but please note that there could be several objects as well. I need to sum the quantity for each size and product variant.
How would I do that using the aggregate function? I am using Mongoose in a Node.js environment.
Query
(It's based on the last comment on the previous answer: a similar query, but it multiplies the quantity by the local price.)
db.collection.aggregate([
{
"$unwind": "$product_variants"
},
{
"$unwind": "$product_variants.variant_details"
},
{
"$unwind": "$product_variants.variant_details.inventory"
},
{
"$set": {
"total_local_price": {
"$multiply": [
"$product_variants.variant_details.inventory.quantity",
"$product_variants.variant_details.local_price"
]
}
}
},
{
$group: {
_id: null, // or "$_id" if you want only for 1 document
total_qty: {
$sum: "$total_local_price"
}
}
}
])
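If instead you want the totals broken down per variant and size, which is what the question describes, a sketch along the same lines (assuming the field names from the document shown above) could group on those fields instead of a single total:
db.collection.aggregate([
  { "$unwind": "$product_variants" },
  { "$unwind": "$product_variants.variant_details" },
  { "$unwind": "$product_variants.variant_details.inventory" },
  { "$group": {
      "_id": {
        "variant": "$product_variants.variant_code",
        "size": "$product_variants.variant_details.size"
      },
      "total_qty": { "$sum": "$product_variants.variant_details.inventory.quantity" }
  }}
])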
You can use this aggregation query:
First, $project to get only the quantity values. It generates the following output:
"array": [
[
[
94
],
[
147
],
[
199
]
]
So the next step is to use $unwind three times to flatten the array.
Then $group by _id using $sum:
yourModel.aggregate([{
"$project": {
"array": "$product_variants.variant_details.inventory.quantity"
}
},
{
"$unwind": "$array"
},
{
"$unwind": "$array"
},
{
"$unwind": "$array"
},
{
"$group": {
"_id": "$_id",
"size": {
"$sum": "$array"
}
}
}])
Edit
As Takis suggested in the comments, if you want the total across your entire collection (not one per document), you can $group using null, as in the sketch below.
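The same pipeline with the $group key swapped to null (a sketch; only the _id of the final stage changes):
yourModel.aggregate([
  { "$project": { "array": "$product_variants.variant_details.inventory.quantity" } },
  { "$unwind": "$array" },
  { "$unwind": "$array" },
  { "$unwind": "$array" },
  { "$group": {
      "_id": null,                    // null groups every document into one total
      "size": { "$sum": "$array" }
  }}
])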
I have 3 arrays of ObjectIds I want to concatenate into a single array, and then sort by creation date. $setUnion does precisely what I want, but I'd like to try without using it.
Schema of object I want to sort:
var chirpSchema = new mongoose.Schema({
interactions: {
_liked : ["55035390d3e910505be02ce2"] // [{ type: $oid, ref: "interaction" }]
, _shared : ["507f191e810c19729de860ea", "507f191e810c19729de860ea"] // [{ type: $oid, ref: "interaction" }]
, _viewed : ["507f1f77bcf86cd799439011"] // [{ type: $oid, ref: "interaction" }]
}
});
Desired result: Concatenate _liked, _shared, and _viewed into a single array, and then sort them by creation date using the aggregation pipeline. See below:
["507f1f77bcf86cd799439011", "507f191e810c19729de860ea", "507f191e810c19729de860ea", "55035390d3e910505be02ce2"]
I know I'm supposed to use $push, $each, $group, and $unwind in some combination or other, but I'm having trouble piecing together the documentation to make this happen.
Update: Query
model_user.aggregate([
{ $match : { '_id' : { $in : following } } }
, { $project : { 'interactions' : 1 } }
, { $project : {
"combined": { $setUnion : [
"$interactions._liked"
, "$interactions._shared"
, "$interactions._viewed"
]}
}}
])
.exec(function (err, data) {
if (err) return next(err);
next(data); // Combined is returning null
})
If all the Object _id values are "unique" then $setUnion is your best option. It is of course not "ordered" in any way, since it works with a "set", and that does not guarantee order. But you can always $unwind and $sort.
[
{ "$project": {
"combined": { "$setUnion": [
{ "$ifNull": [ "$interactions._liked", [] ] },
{ "$ifNull": [ "$interactions._shared", [] ] },
{ "$ifNull", [ "$interactions._viewed", [] ] }
]}
}},
{ "$unwind": "$combined" },
{ "$sort": { "combined": 1 } },
{ "$group": {
"_id": "$_id",
"combined": { "$push": "$combined" }
}}
]
Of course again since this is a "set" of distinct values you can do the old way instead with $addToSet, after processing $unwind on each array:
[
{ "$unwind": "$interactions._liked" },
{ "$unwind": "$interactions._shared" },
{ "$unwind": "$interactions._viewed" },
{ "$project": {
"interactions": 1,
"type": { "$const": [ "liked", "shared", "viewed" ] }
}},
{ "$unwind": "$type" },
{ "$group": {
"_id": "$_id",
"combined": {
"$addToSet": {
"$cond": [
{ "$eq": [ "$type", "liked" ] },
"$interactions._liked",
{ "$cond": [
{ "$eq": [ "$type", "shared" ] },
"$interactions._shared",
"$interactions._viewed"
]}
]
}
}
}},
{ "$unwind": "$combined" },
{ "$sort": { "combined": 1 } },
{ "$group": {
"_id": "$_id",
"combined": { "$push": "$combined" }
}}
]
But still the same thing applies to ordering.
Future releases even have the ability to concatenate arrays without reducing to a "set":
[
{ "$project": {
"combined": { "$concatArrays": [
"$interactions._liked",
"$interactions._shared",
"$interactions._viewed"
]}
}},
{ "$unwind": "$combined" },
{ "$sort": { "combined": 1 } },
{ "$group": {
"_id": "$_id",
"combined": { "$push": "$combined" }
}}
]
But still there is no way to re-order the results without processing $unwind and $sort.
You might therefore consider that, unless you need this grouped across multiple documents, the basic "concatenate and sort" operation is best handled in client code. MongoDB has no way to do this "in place" on the array at present, so per document in client code is your best bet.
But if you do need to do this grouping over multiple documents, then the sort of approaches as shown here are for you.
Also note that "creation" here means creation of the ObjectId value itself and not other properties from your referenced objects. If you need those, then you perform a populate on the id values after the aggregation or query instead, and of course sort in client code.
Here is an example of my Schema with some data:
client {
menus: [{
sections: [{
items: [{
slug: 'some-thing'
}]
}]
}]
}
And I am trying to select it like this:
Schema.findOne({ '_id': id, 'menus.sections.items.slug': 'some-thing' }).select('menus.sections.items.$').exec(function(error, docs){
console.log(docs.menus[0].sections[0].items[0].slug);
});
Of course "docs.menus[0].sections[0].items[0].slug" only works if there is only one thing in each array. How can I make this work if there is multiple items in each array without having to loop through everything to find it?
If you need more details let me know.
The aggregation framework is good for finding things in deeply nested arrays where the positional operator will fail you:
Model.aggregate(
[
// Match the "documents" that meet your criteria
{ "$match": {
"menus.sections.items.slug": "some-thing"
}},
// Unwind the arrays to de-normalize as documents
{ "$unwind": "$menus" },
{ "$unwind": "$menus.sections" },
{ "$unwind": "$menus.sections.items" }
// Match only the element(s) that meet the criteria
{ "$match": {
"menus.sections.items.slug": "some-thing"
}},
// Optionally group everything back to the nested array
// One step at a time
{ "$group": {
"_id": "$_id",
"items": { "$push": "$menus.sections.items.slug" }
}},
{ "$group": {
"_id": "$_id",
"sections": {
"$push": { "items": "$items" }
}
}},
{ "$group": {
"_id": "$_id",
"menus": {
"$push": { "sections": "$sections" }
}
}},
],
function(err,results) {
}
)
Also see the other aggregation operators such as $first for keeping other fields in your document when using $group.
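For instance, a sketch of that first $group stage extended with $first to also carry along a hypothetical top-level name field:
{ "$group": {
    "_id": "$_id",
    "name": { "$first": "$name" },    // "name" is a hypothetical field kept via $first
    "items": { "$push": "$menus.sections.items.slug" }
}},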
If I have a collection with documents looking more or less like this:
{
"_id": "52c06c86b78a091f26000001",
"desc": "Something somethingson",
"likes": [
{
"user_id": "52add4f4344e8ca867000001",
"username": "asd",
"created": 1390652212592
},
{
"user_id": "52add4f4344e8ca867000001",
"username": "asd",
"created": 1390652212592
}
],
"user_id": "52add4f4344e8ca867000001",
"username":"username"
}
Could I, with MongoDB (using Node.js & Express), get a list of the 10 users with the most liked posts?
I'm thinking it should be possible to do this with map-reduce or with aggregate, by grouping all posts by "user_id", counting the total number of items in the "likes" array, and then taking the top 10 "user_id"s from that.
I'm guessing this is possible. The only problem is that I don't understand the examples on the MongoDB website well enough to put together my own query.
Could anyone show me an example of doing this, or at least tell me it's impossible if it is? =)
Try this out:
db.collection.aggregate([
{$unwind: "$likes"},
{$group: {_id: "$user_id", likesCount: {$sum: 1}}},
{$sort: {likesCount: -1}},
{$limit: 10},
{$project: {_id: 1}}
])
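Since the question mentions Node.js and Express, here is a minimal sketch of wiring that pipeline into a route (the Post model and route path are hypothetical, and an existing Express app and Mongoose connection are assumed; the final $project is dropped so the counts come back too):
// Assumes: var Post = mongoose.model('Post', postSchema); // hypothetical model
app.get('/top-users', function (req, res, next) {
  Post.aggregate([
    { $unwind: "$likes" },
    { $group: { _id: "$user_id", likesCount: { $sum: 1 } } },
    { $sort: { likesCount: -1 } },
    { $limit: 10 }
  ]).exec(function (err, users) {
    if (err) return next(err);
    res.json(users); // top 10 user_ids by total likes on their posts
  });
});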
Yes, it is possible:
db.coll.aggregate([
{
$unwind: "$likes"
},
{
$group: {
_id: "$user_id",
likesPerUser: {
$sum:1
}
}
},
{
$sort: {
likesPerUser: -1
}
},
{
$limit: 10
}
])
If you need to count the number of "likes" elements per parent user, you can do the following:
http://mongotry.herokuapp.com/#?bookmarkId=52fb8c604e86f9020071ed71
[
{
"$unwind": "$likes"
},
{
"$group": {
"_id": "$user_id",
"likesPerUser": {
"$sum": 1
}
}
},
{
"$sort": {
"likesPerUser": -1
}
},
{
"$limit": 10
}
]
Otherwise, if you need the count per user_id inside the likes sub-documents, counting 1 per sub-document across all parents, you can do the following:
http://mongotry.herokuapp.com/#?bookmarkId=52fb8cf74e86f9020071ed72
[
{
"$unwind": "$likes"
},
{
"$group": {
"_id": "$likes.user_id",
"likesPerUser": {
"$sum": 1
}
}
},
{
"$sort": {
"likesPerUser": -1
}
},
{
"$limit": 1
}
]