MongoDB query - if value appears in array x times - node.js

{
    "_id": ObjectId("53ab1d2c256072374a5cc63f"),
    "title": "10% Off",
    "endDate": "2015-05-08",
    "limit": "limited",
    "redemptions": [
        "1f7f5f96be3a",
        "kf40vksk03ps"
    ]
}
{
    "_id": ObjectId("53ab1d2c25607sfdgs74a5cc63f"),
    "title": "20% Off",
    "endDate": "2015-06-07",
    "limit": "unlimited",
    "redemptions": [
        "1f7f5f96be3a",
        "1f7f5f96be3a",
        "kf40vksk03ps"
    ]
}
Story: a person can redeem a coupon 2 times. After 2 times, don't return it.
How can I check that a value appears fewer than 2 times?
Wish it was as easy as:
{ 'redemptions' : { $exists: true }, $where : 'this.redemptions.$.1f7f5f96be3a.length < 2' }
How can I get a count for how many times a specific value is in an array and compare on that?
Edit
So to add some fun: I updated my schema, so I need to put that into a conditional. If limit = 'unlimited', return the record; if limit = 'limited', return it only if the array has fewer than 2 values equal to '1f7f5f96be3a'.

You can do what you want with the aggregation framework:
db.collection.aggregate([
    /* find only documents that have a redemptions key */
    { $match : { redemptions : { $exists : true } } },
    /* unwind the redemptions array so we can filter documents by coupons */
    { $unwind : "$redemptions" },
    /* keep only the coupon redemptions you're looking for */
    { $match : { redemptions : "1f7f5f96be3a" } },
    /* group the documents back so we can calculate the count */
    { $group : {
        _id : "$_id",
        title : { $first : "$title" },
        endDate : { $first : "$endDate" },
        count : { $sum : 1 }
    } },
    /* finally, keep only the documents that have fewer than 2 redemptions */
    { $match : { count : { $lt : 2 } } }
]);
Edit:
You just need to change the $group and last $match stages:
    { $group : {
        _id : "$_id",
        title : { $first : "$title" },
        endDate : { $first : "$endDate" },
        limit : { $first : "$limit" },
        count : { $sum : 1 }
    } },
    /* finally, keep only the documents that have fewer than 2 redemptions
       or have limit "unlimited" */
    { $match : { $or : [ { count : { $lt : 2 } }, { limit : "unlimited" } ] } }
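Since the question mentions node.js: the per-document rule the pipeline implements can also be sketched in plain JavaScript, which makes it easy to test. The sample documents below just mirror the two shown in the question; this is a sketch of the logic, not a replacement for the server-side pipeline.

```javascript
// Sample documents mirroring the schema above.
const coupons = [
    { title: "10% Off", limit: "limited",
      redemptions: ["1f7f5f96be3a", "kf40vksk03ps"] },
    { title: "20% Off", limit: "unlimited",
      redemptions: ["1f7f5f96be3a", "1f7f5f96be3a", "kf40vksk03ps"] }
];

// Return coupons a user may still redeem: unlimited ones always,
// limited ones only while the user appears fewer than 2 times.
function redeemable(docs, userId) {
    return docs.filter(doc => {
        if (doc.limit === "unlimited") return true;
        const count = doc.redemptions.filter(r => r === userId).length;
        return count < 2;
    });
}

console.log(redeemable(coupons, "1f7f5f96be3a").map(d => d.title));
// [ '10% Off', '20% Off' ]
```

Note that, unlike the in-memory filter above, the pipeline's second $match drops documents where the user has zero redemptions, so coupons the user never touched would need separate handling.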

Related

Mongo aggregation $count pipeline to return count and objects [duplicate]

I want to perform an aggregation query that does basic pagination:
Find all orders that belong to a certain company_id
Sort the orders by order_number
Count the total number of documents
Skip to e.g. document number 100 and pass on the rest
Limit the number of documents to e.g. 2 and pass them on
Finish by returning the count and a selected few fields from the documents
Here is a breakdown of the query:
db.Order.collection.aggregate([
This finds all matching documents:
{ '$match' : { "company_id" : ObjectId("54c0...") } },
This sorts the documents:
{ '$sort' : { 'order_number' : -1 } },
This counts the documents and passes them on unmodified, but I'm sure I'm doing it wrong, because things turn weird from here:
{
    '$group' : {
        '_id' : null,
        'count' : { '$sum' : 1 },
        'entries' : { '$push' : "$$ROOT" }
    }
},
This seems to skip some documents:
{ "$skip" : 100 },
This is supposed to limit the documents, but it does not:
{ "$limit" : 2 },
This does return the count, but it does not return the documents in an array; instead it returns arrays with each field:
{ '$project' : {
    'count' : 1,
    'entries' : { '_id' : "$entries._id", 'order_number' : "$entries.order_number" }
} }
])
This is the result:
[
    {
        "_id" : null,
        "count" : 300,
        "entries" : [
            {
                "_id" : [ObjectId('5a5c...'), ObjectId('5a5c...')],
                "order_number" : ["4346", "4345"]
            },
            {
                "_id" : [ObjectId('5a5c...'), ObjectId('5a5c...')],
                "order_number" : ["4346", "4345"]
            },
            ...
        ]
    }
]
Where do I get it wrong?
To calculate totals and return a subset, you need to apply the grouping and the skip/limit to the same dataset. For that you can use $facet.
For example, to show the 3rd page with 10 documents per page:
db.Order.aggregate([
    { '$match' : { "company_id" : ObjectId("54c0...") } },
    { '$sort' : { 'order_number' : -1 } },
    { '$facet' : {
        metadata: [ { $count: "total" }, { $addFields: { page: NumberInt(3) } } ],
        data: [ { $skip: 20 }, { $limit: 10 } ] // add a projection stage here if you wish to re-shape the docs
    } }
])
It will return a single document with 2 fields:
{
    "metadata" : [
        {
            "total" : 300,
            "page" : 3
        }
    ],
    "data" : [
        { ... original document ... },
        { ... another document ... },
        { ... etc. up to 10 docs ... }
    ]
}
Since MongoDB version 5.0 there is another option that avoids the main disadvantage of $facet: the grouping of all returned documents into one big document. The concern there is that a single document has a size limit of 16MB. Using $setWindowFields avoids this:
db.Order.aggregate([
    { $match: { company_id: ObjectId("54c0...") } },
    { $sort: { order_number: -1 } },
    { $setWindowFields: { output: { totalCount: { $count: {} } } } },
    { $skip: 20 },
    { $limit: 10 }
])
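Either way, the skip/limit arithmetic for "page N with pageSize documents" is the same. A minimal plain-JavaScript sketch of the $facet result shape, using hypothetical sample data (300 fake orders), makes it easy to sanity-check the offsets:

```javascript
// Simulate the $facet pagination over an already-sorted array of orders.
function paginate(docs, page, pageSize) {
    return {
        metadata: [{ total: docs.length, page: page }],
        data: docs.slice((page - 1) * pageSize, page * pageSize)
    };
}

// 300 fake orders, already sorted by order_number descending.
const orders = Array.from({ length: 300 }, (_, i) => ({ order_number: 300 - i }));

const result = paginate(orders, 3, 10);
console.log(result.metadata[0]);          // { total: 300, page: 3 }
console.log(result.data.length);          // 10
console.log(result.data[0].order_number); // 280 (20 docs skipped)
```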

MongoDB query to get the sum of all document's array field length

Below is the sample document of a collection, say "CollectionA"
{
    "_id" : ObjectId("5ec3f19225701c4f7ab11a5f"),
    "workshop" : ObjectId("5ebd37a3d33055331eb4730f"),
    "participant" : ObjectId("5ebd382dd33055331eb47310"),
    "status" : "analyzed",
    "createdBy" : ObjectId("5eb7aa24d33055331eb4728c"),
    "updatedBy" : ObjectId("5eb7aa24d33055331eb4728c"),
    "results" : [
        {
            "analyze_by" : {
                "user_name" : "m",
                "user_id" : "5eb7aa24d33055331eb4728c"
            },
            "category_list" : [
                "Communication",
                "Controlling",
                "Leading",
                "Organizing",
                "Planning",
                "Staffing"
            ],
            "analyzed_date" : ISODate("2020-05-19T14:48:49.993Z")
        }
    ],
    "summary" : [],
    "isDeleted" : false,
    "isActive" : true,
    "updatedDate" : ISODate("2020-05-19T14:48:50.827Z"),
    "createdDate" : ISODate("2020-05-19T14:47:46.374Z"),
    "__v" : 0
}
I need to query all the documents to get the "results" array length and return a sum of all document's "results" length.
For example,
document 1 has "results" length - 5
document 2 has "results" length - 6
then output should be 11.
Can we write a query for this, instead of fetching all documents, iterating over them, and adding up the results lengths?
If I understand correctly, you would like to project the length of the results attribute.
Check whether the $size operator works for you:
https://docs.mongodb.com/manual/reference/operator/aggregation/size/
You can use $group with $sum to total a field that contains the size of your results array. To create that field, use $size inside $addFields to compute the size of results in each document, as below:
db.getCollection('your_collection').aggregate([
    {
        $addFields: {
            result_length: { $size: "$results" }
        }
    },
    {
        $group: {
            _id: '',
            total_result_length: { $sum: '$result_length' }
        }
    }
])
You can use an aggregation grouping query with the $sum and $size operators to get the total size of the array elements across all documents in the collection.
db.collection.aggregate([
    {
        $group: {
            _id: null,
            total_count: { $sum: { $size: "$results" } }
        }
    }
])
Aggregation using Mongoose's Model.aggregate():
SomeModel.aggregate([
    {
        $group: {
            _id: null,
            total_count: { $sum: { $size: "$results" } }
        }
    }
]).then(function (result) {
    console.log(result);
});
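The $group/$size pipeline is equivalent to a simple reduce over the documents, which is easy to verify in plain JavaScript (the sample lengths below are the 5 and 6 from the question's example):

```javascript
// Equivalent of { $group: { _id: null, total_count: { $sum: { $size: "$results" } } } }
function totalResultsLength(docs) {
    return docs.reduce((sum, doc) => sum + doc.results.length, 0);
}

const docs = [
    { results: new Array(5).fill({}) }, // document 1: "results" length 5
    { results: new Array(6).fill({}) }  // document 2: "results" length 6
];
console.log(totalResultsLength(docs)); // 11
```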

Mongo aggregate query taking too long on server

I am trying to use the aggregation framework in Mongo for some data stats. The query I am using hardly takes a minute when run locally, but when I run the same query on the server it does not respond, and after waiting too long I had to cancel it. Can anyone please suggest why this is happening?
var orderIds = db.delivery.find({ "status" : "DELIVERED" }).map(function(o) {
    return o.order;
});
var userIds = db.order.aggregate([
    { $match : { _id : { $in : orderIds } } },
    { $group : { _id : "$customer" } }
]).map(function(u) { return u._id; });
var userstats = db.order.aggregate([
    { $sort : { customer : 1, dateCreated : 1 } },
    { $match : { status : "DELIVERED", customer : { $in : userIds } } },
    { $group : {
        _id : "$customer",
        orders : { $sum : 1 },
        firstOrderDate : { $first : "$dateCreated" },
        lastOrderDate : { $last : "$dateCreated" }
    } }
]);
userstats.forEach(function(x) {
    db.user.update({ _id : x._id }, {
        $set : {
            totalOrders : x.orders,
            firstOrderDate : x.firstOrderDate,
            lastOrderDate : x.lastOrderDate
        }
    });
});
I am not sure, but shouldn't it be faster on the server? Instead, it is not able to produce any output.
To speed up the process you could refactor your operations in a couple of ways.
The first would be to eliminate unnecessary pipeline stages like the $sort operator, which can be replaced with the $max and $min operators within the $group pipeline.
Secondly, use the Bulk API, which will increase performance on update operations, especially when dealing with large collections, since it sends the operations to the server in batches (for example, a batch size of 500) instead of sending every request to the server individually (as you are currently doing with the update statement inside the forEach() loop).
Consider the following refactored operations:
var orderIds = db.delivery.find({ "status": "DELIVERED" }).map(function(d) { return d.order; }),
    counter = 0,
    bulk = db.user.initializeUnorderedBulkOp();
var userstatsCursor = db.order.aggregate([
    { "$match": { "_id": { "$in": orderIds } } },
    {
        "$group": {
            "_id": "$customer",
            "orders": { "$sum": 1 },
            "firstOrderDate": { "$min": "$dateCreated" },
            "lastOrderDate": { "$max": "$dateCreated" }
        }
    }
]);
userstatsCursor.forEach(function (x) {
    bulk.find({ "_id": x._id }).updateOne({
        "$set": {
            "totalOrders": x.orders,
            "firstOrderDate": x.firstOrderDate,
            "lastOrderDate": x.lastOrderDate
        }
    });
    counter++;
    if (counter % 500 == 0) {
        bulk.execute(); // Execute per 500 operations and
                        // re-initialize every 500 update statements
        bulk = db.user.initializeUnorderedBulkOp();
    }
});
// Clean up remaining operations in queue
if (counter % 500 != 0) { bulk.execute(); }
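The batching arithmetic in the loop above (flush every 500 operations, then flush the remainder) can be checked independently of the driver. This sketch only counts how many bulk.execute() calls the loop performs for a given number of update operations:

```javascript
// Count how many bulk.execute() calls the batching loop performs
// for numOps update operations with the given batch size.
function executeCalls(numOps, batchSize) {
    let calls = Math.floor(numOps / batchSize); // one per full batch
    if (numOps % batchSize !== 0) calls += 1;   // final partial batch
    return calls;
}

console.log(executeCalls(1234, 500)); // 3 (two full batches + remainder)
console.log(executeCalls(1000, 500)); // 2 (no remainder to flush)
```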
I recommend you make $match the first operation in your pipeline as the $match operator can only use an index if it is first in the aggregation pipeline:
var userstats = db.order.aggregate([
    { $match : {
        status : "DELIVERED",
        customer : { $in : userIds }
    } },
    { $sort : {
        customer : 1,
        dateCreated : 1
    } },
    { $group : {
        _id : "$customer",
        orders : { $sum : 1 },
        firstOrderDate : { $first : "$dateCreated" },
        lastOrderDate : { $last : "$dateCreated" }
    } }
]);
You should also add an index on status and customer to the order collection (the one the $match stage runs against) if you have not already defined one:
db.order.createIndex({ status: 1, customer: 1 })

Search for most common "data" with mongoose, mongodb

My structure.
User:
{
    name: "One",
    favoriteWorkouts: [ids of workouts],
    workouts: [ { name: "My workout 1" }, ... ]
}
I want to get a list of the favorite/hottest workouts from the database.
db.users.aggregate(
    { $unwind : "$favorite" },
    { $group : { _id : "$favorite", number : { $sum : 1 } } },
    { $sort : { number : -1 } }
)
This returns
{
    "hot": [
        {
            "_id": "521f6c27145c5d515f000006",
            "number": 1
        },
        {
            "_id": "521f6c2f145c5d515f000007",
            "number": 1
        },
        ...
    ]
}
But I want
{
    hot: [
        object of hottest workout 1, object of hottest workout 2, ...
    ]
}
How do you sort the hottest data and fill the result with objects, not just ids?
You are correct to want to use MongoDB's aggregation framework. Aggregation will give you the output you are looking for if used correctly. If you are looking for just a list of the _id's of all users' favorite workouts, then I believe that you would need to add an additional $group operation to your pipeline:
db.users.aggregate(
    { $unwind : "$favoriteWorkouts" },
    { $group : { _id : "$favoriteWorkouts", number : { $sum : 1 } } },
    { $sort : { number : -1 } },
    { $group : { _id : "oneDocumentWithWorkoutArray", hot : { $push : "$_id" } } }
)
This will yield a document of the following form, with the workout ids listed by popularity:
{
    "_id" : "oneDocumentWithWorkoutArray",
    "hot" : [
        "workout6",
        "workout1",
        "workout5",
        "workout4",
        "workout3",
        "workout2"
    ]
}
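To get full objects rather than ids, you still need to map each hot id back to its workout document. A plain-JavaScript sketch of the unwind/group/sort plus that final mapping step (users, workout ids, and the workoutsById lookup map below are all hypothetical sample data):

```javascript
// Count favorites per workout id, sort by popularity, then map
// the ids back to full workout objects via a lookup map.
function hottestWorkouts(users, workoutsById) {
    const counts = {};
    for (const user of users) {
        for (const id of user.favoriteWorkouts) {
            counts[id] = (counts[id] || 0) + 1;
        }
    }
    return Object.keys(counts)
        .sort((a, b) => counts[b] - counts[a])
        .map(id => workoutsById[id]);
}

const users = [
    { favoriteWorkouts: ["w1", "w2"] },
    { favoriteWorkouts: ["w1"] }
];
const workoutsById = {
    w1: { name: "My workout 1" },
    w2: { name: "My workout 2" }
};

console.log(hottestWorkouts(users, workoutsById));
// [ { name: 'My workout 1' }, { name: 'My workout 2' } ]
```

In MongoDB itself the equivalent mapping step would typically be a $lookup stage or a second query on the ids, depending on where the workout documents live.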

Mongoose nodejs - Grouping with arrays

I have got a collection with the following schema :
{OrderId : id, CustomerId : id, amount: Number, otherPayers : [{name : String, amount : Number}]}
I would like to make an average of the "amount" of all orders for a customerId, and I can do that with aggregate by grouping on customerId with avg on amount.
Now, I'd like the output "amount" field to be not only the avg of "amount", but also of the "amount" field of the "otherPayers" field.
I have had a look into mapreduce but I can't get it to work.
Any idea? In SQL it would have been quite simple with a subquery.
I'm guessing you want to average together all the amounts inside the otherPayers array plus the amount field on the top level of the document. Here is how you would do it with the aggregation framework:
unwind = { $unwind : "$otherPayers" };
group = { $group : {
    _id : "$CustomerId",
    amt : { $first : "$amount" },
    count : { $sum : 1 },
    total : { $sum : "$otherPayers.amount" }
} };
project = { $project : {
    _id : 0,
    CustomerId : "$_id",
    avgPaid : { $divide : [
        { $add : [ "$total", "$amt" ] },
        { $add : [ "$count", 1 ] }
    ] }
} };
db.collection.aggregate([unwind, group, project]);
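For a single order document, the arithmetic the pipeline performs, (sum of otherPayers amounts + top-level amount) / (number of payers + 1), can be checked in plain JavaScript (the sample values below are assumptions, not data from the question):

```javascript
// Average the top-level amount together with each otherPayers amount.
function avgPaid(order) {
    const total = order.otherPayers.reduce((sum, p) => sum + p.amount, 0);
    return (total + order.amount) / (order.otherPayers.length + 1);
}

const order = {
    CustomerId: "c1",
    amount: 30,
    otherPayers: [{ name: "A", amount: 10 }, { name: "B", amount: 20 }]
};
console.log(avgPaid(order)); // (10 + 20 + 30) / 3 = 20
```

Note that the $group stage above keys on CustomerId and takes $first of amount, so with several orders per customer the pipeline and this per-document sketch diverge; the sketch only verifies the division logic.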