My Collection JSON
[
{
"_id" : 0,
"finalAmount":40,
"payment":[
{
"_id":0,
"cash":20
},
{
"_id":1,
"card":20
}
]
},
{
"_id" : 1,
"finalAmount":80,
"payment":[
{
"_id":0,
"cash":60
},
{
"_id":1,
"card":20
}
]
},
{
"_id" : 2,
"finalAmount":80,
"payment":[
{
"_id":0,
"cash":80
}
]
}
]
I want to get the cash and card amounts grouped across all documents using the aggregation framework. Can anyone help?
Please consider my _id as an ObjectId; I have used 0 and 1 only for demo purposes. I am using Node.js and MongoDB, and I want the expected output from just one query, as follows:
Expected Output:
{
"cash":160,
"card":40,
"total":200,
"count":3
}
You could try running the following aggregation pipeline. Note that there may be a performance penalty, and you may hit pipeline limits, on huge datasets: the initial $group stage collects every document in the collection to get the total count and amount, and pushes each document into a temporary list, which can slow down the rest of the pipeline.
Nonetheless, the following solution will yield the given desired output from the given sample:
collection.aggregate([
{
"$group": {
"_id": null,
"count": { "$sum": 1 },
"doc": { "$push": "$$ROOT" },
"total": { "$sum": "$finalAmount" }
}
},
{ "$unwind": "$doc" },
{ "$unwind": "$doc.payment" },
{
"$group": {
"_id": null,
"count": { "$first": "$count" },
"total": { "$first": "$total" },
"cash": { "$sum": "$doc.payment.cash" },
"card": { "$sum": "$doc.payment.card" }
}
}
], function(err, result) {
console.log(result);
});
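As a sanity check, the same totals can be computed client-side in plain JavaScript from the sample documents (an illustration of what the pipeline does, not a replacement for running it on the server):

```javascript
// Sample documents from the question.
const docs = [
  { _id: 0, finalAmount: 40, payment: [{ _id: 0, cash: 20 }, { _id: 1, card: 20 }] },
  { _id: 1, finalAmount: 80, payment: [{ _id: 0, cash: 60 }, { _id: 1, card: 20 }] },
  { _id: 2, finalAmount: 80, payment: [{ _id: 0, cash: 80 }] }
];

// Mirror of the pipeline: total and count from the top level,
// cash and card summed across the unwound payment arrays.
// Missing fields count as 0, just as $sum treats absent fields.
const summary = docs.reduce((acc, doc) => {
  acc.count += 1;
  acc.total += doc.finalAmount;
  for (const p of doc.payment) {
    acc.cash += p.cash || 0;
    acc.card += p.card || 0;
  }
  return acc;
}, { cash: 0, card: 0, total: 0, count: 0 });

console.log(summary); // { cash: 160, card: 40, total: 200, count: 3 }
```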
On big datasets this problem may be better suited to, and faster with, a map-reduce operation, since the result is a single aggregated document.
var map = function map(){
    var cash = 0;
    var card = 0;
    for (var i in this.payment){
        if(this.payment[i].hasOwnProperty('cash')){
            cash += this.payment[i]['cash'];
        }
        if(this.payment[i].hasOwnProperty('card')){
            card += this.payment[i]['card'];
        }
    }
    // Emit a count of 1 per document so the reduce function can safely be
    // re-applied to partial results (MongoDB may call reduce more than once).
    emit(null, { 'cash': cash, 'card': card, 'total': cash + card, 'count': 1 });
};
var reduce = function(key, values){
    var result = { 'cash': 0, 'card': 0, 'total': 0, 'count': 0 };
    for (var i in values){
        result.cash += values[i].cash;
        result.card += values[i].card;
        result.total += values[i].total;
        result.count += values[i].count;
    }
    return result;
};
db.runCommand({"mapReduce":"test", map:map, reduce:reduce, out:{replace:"test2"}})
result:
db.test2.find().pretty()
{
"_id" : null,
"value" : {
"cash" : 160,
"card" : 40,
"total" : 200,
"count" : 3
}
}
So what I want to do is group all documents having the same Hash whose count is more than 1, and keep only the oldest record according to startDate.
My db structure is as follows:
[{
"_id": "82bacef1915f4a75e6a18406",
"Hash": "cdb3d507734383260b1d26bd3edcdfac",
"duration": 12,
"price": 999,
"purchaseType": "Complementary",
"startDate": {
"$date": {
"$numberLong": "1656409841000"
}
},
"endDate": {
"$date": {
"$numberLong": "1687859441000"
}
}
}]
I was using this query which I created
db.Mydb.aggregate([
{
"$group": {
_id: {hash: "$Hash"},
dups: { $addToSet: "$_id" } ,
count: { $sum : 1 }
}
},{"$sort":{startDate:-1}},
{
"$match": {
count: { "$gt": 1 }
}
}
]).forEach(function(doc) {
doc.dups.shift();
db.Mydb.deleteMany({
_id: {$in: doc.dups}
});
})
this gives a result like this:
{ _id: { hash: '1c01ef475d072f207c4485d0a6448334' },
dups:
[ '6307501ca03c94389f09b782',
'6307501ca03c94389f09b783',
'62bacef1915f4a75e6a18l06' ],
count: 3 }
The problem with this is that the _ids in the dups array come back in a random order every time I run this query, i.e. they are not sorted according to the startDate field.
What can be done here?
Any help is appreciated. Thanks!
After the $group stage, the startDate field will no longer be present in the results, so you cannot sort on it there. As stated in the comments, you should put the $sort stage first in the aggregation pipeline.
db.Mydb.aggregate([
  {
    "$sort": { startDate: -1 }
  },
  {
    "$group": {
      _id: { hash: "$Hash" },
      dups: { $addToSet: "$_id" },
      count: { $sum: 1 }
    }
  },
  {
    "$match": { count: { "$gt": 1 } }
  }
])
Got the solution. I was using $addToSet in the $group stage; $addToSet makes no guarantee about the order of the collected values (and since _ids are unique, duplicates were never really the issue). Switching to $push, which preserves the sorted document order, fixed it.
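The ordering behaviour can be illustrated in plain JavaScript: sorting first and then collecting ids in encounter order (what $push does) makes the surviving record deterministic, whereas $addToSet gives no order guarantee. The documents and values below are made up for the demo:

```javascript
// Hypothetical documents with a duplicate Hash; dates are plain numbers for brevity.
const docs = [
  { _id: "a", Hash: "h1", startDate: 300 },
  { _id: "b", Hash: "h1", startDate: 100 }, // the oldest for h1
  { _id: "c", Hash: "h1", startDate: 200 }
];

// $sort: { startDate: 1 } -- oldest first, so the first id pushed per hash
// is the one to keep. (A descending sort would keep the newest instead.)
const sorted = [...docs].sort((x, y) => x.startDate - y.startDate);

// $group with $push preserves the sorted encounter order;
// $addToSet makes no ordering guarantee.
const groups = new Map();
for (const d of sorted) {
  if (!groups.has(d.Hash)) groups.set(d.Hash, []);
  groups.get(d.Hash).push(d._id);
}

// Equivalent of doc.dups.shift(): keep the first (oldest), delete the rest.
const dups = groups.get("h1").slice(1);
console.log(groups.get("h1")[0]); // "b" -- the oldest record survives
console.log(dups); // [ 'c', 'a' ] -- candidates for deleteMany
```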
I am a newbie to MongoDB. I have a use case where a normal MongoDB query should be converted to an aggregation query.
I have the following two collections:
items: {_id: "something", "name": "raj", "branch": "IT", subItems: ["subItemId","subItemId"]}
subItems: {_id: "something", "city": "hyd", list: ["adsf","adsf"]}
Query passed by the user is:
{
"query": { "name": "raj", "subItems.loc": "hyd" },
"sort": { "name": 1 },
"projection": { "_id": 0, "branch": 1, "subItems.loc": 1 }
}
I am able to create a dynamic query in a following way:
let itemConditions: any[] = [];
let subItemConditions: any[] = [];
let itemProjection: {[k:string]: any} = {};
let subItemProjection: {[k:string]: any} = {};
subItemConditions.push({$in: ["$_id", "$$subItems"]}); // Default condition to search files in the subItems collection
let isAnysubItemProj = false;
Object.keys(reqQuery).forEach(prop => {
let value = reqQuery[prop];
if(prop.includes('subItems.')) {
key = prop.split(".")[1];
if(key === '_id') value = new ObjectId(value);
subItemConditions.push({$eq: [`$${prop}`, value]});
return;
}
itemConditions.push({$eq: [`$${prop}`, value]});
});
if(config.projection)
Object.keys(config.projection).forEach(prop => {
if(prop.includes('subItems.')) {
isAnysubItemProj = true;
const key = prop.split(".")[1];
subItemProjection[key] = config.projection[prop];
return;
}
itemProjection[prop] = config.projection[prop];
});
if(isAnysubItemProj) itemProjection['subItems'] = 1;
let subItemPipeline: any[] = [];
subItemPipeline.push(
{ $match: {
$expr: {
$and: subItemConditions
}
}
});
if(Object.keys(subItemProjection).length)
subItemPipeline.push({$project: subItemProjection});
let query: any[] = [
{
$match: {
$expr : {
$and: itemConditions
}
}
},
{
$addFields: {
subItems: {
$map: {
input: "$subItems",
as: "id",
in: { $toObjectId: "$$id" }
}
}
}
},
{
$lookup: {
from: "subItems",
let: {subItems: "$subItems"},
pipeline: subItemPipeline,
as: "subItems"
}
}
];
if(config.sort && Object.keys(config.sort).length) query.push({$sort: config.sort});
if(Object.keys(itemProjection).length) query.push({$project: itemProjection});
const items = await collection.aggregate(query).toArray();
The above code will work only for the comparison of equality for items and subItems separately, but the user can send different types of queries like:
{
"query": { $or: [{"name": "raj"}, {subItems: {$gt: { $size: 3}}}], "subItems.city": "hyd" },
"sort": { "name": 1 },
"projection": { "_id": 0, "branch": 1, "subItems.loc": 1 }
}
{
"query": { $or: [{"name": "raj"}, {"subItems.city": {"$in" : ["hyd", "ncr"]}}], "subItems.list": {"$size": 2} },
"sort": { "name": 1 },
"projection": { "_id": 0, "branch": 1, "subItems.loc": 1 }
}
Is there an easy way to convert such a normal MongoDB query into an aggregation query, or is there another approach to implement this?
I am trying to modify the dynamic query above to work for any query passed by the user, but it is becoming difficult to handle all cases.
Is there a better approach to this situation, such as transforming the query the user passes, changing how it is handled on the server side, or restructuring my code to support all query types?
Any help will be appreciated
If this is really your input
input = {
"query": { "name": "raj", "subItems.loc": "hyd" },
"sort": { "name": 1 },
"projection": { "_id": 0, "branch": 1, "subItems.loc": 1 }
}
Then the aggregation pipeline would be simply:
let pipeline = [
{$match: input.query},
{$sort: input.sort},
{$project: input.projection}
]
db.collection.aggregate(pipeline)
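A sketch of the same idea as a reusable helper, skipping any stage the user did not supply (the buildPipeline name is mine, not from any library):

```javascript
// Hypothetical helper: map the user's input straight onto pipeline stages,
// omitting any stage the user left empty or unset.
function buildPipeline(input) {
  const pipeline = [];
  if (input.query && Object.keys(input.query).length) {
    pipeline.push({ $match: input.query });
  }
  if (input.sort && Object.keys(input.sort).length) {
    pipeline.push({ $sort: input.sort });
  }
  if (input.projection && Object.keys(input.projection).length) {
    pipeline.push({ $project: input.projection });
  }
  return pipeline;
}

// The input from the question:
const input = {
  query: { name: "raj", "subItems.loc": "hyd" },
  sort: { name: 1 },
  projection: { _id: 0, branch: 1, "subItems.loc": 1 }
};

console.log(buildPipeline(input).length); // 3
```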
I am trying to get some filtered data and its total count. I want to do both jobs within a single query; how can I do this? Below is my code.
var SubId = 1;
var TypeId = 1;
var lookup = {
$lookup:
{
from: 'sub_types',
localField: 'sub_id',
foreignField: 'sub_id',
as: 'sub_category'
}
};
var unwind = { $unwind: "$sub_category" };
var project = {
"ques_id": 1,
"ques_txt": 1,
"ans_txt": 1,
"ielts_sub_id": 1,
"ielts_tags_id": 1,
};
var match = {
"sub_category.type_id": parseInt(TypeId),
"sub_category.sub_id": parseInt(SubId),
"status": 1
};
ieltsmongoose.collection('ques').aggregate([
lookup, unwind,
{
$match: match
},
{
$project: project
}
]).limit(max_row).toArray(async function (error, Ques) {....});
Now I want to get count with this same query like
{
$count: "totalcount"
},
You can use another aggregation stage, $group, like this to achieve your goal:
{ "$group": {
"_id": null,
"count": { "$sum": 1},
"data": { "$push": "$$ROOT" }
}}
Your aggregation would then look like:
ieltsmongoose.collection('ques').aggregate([
lookup, unwind,
{
$match: match
},
{
$project: project
},
{ "$group": {
"_id": null,
"count": { "$sum": 1},
"data": { "$push": "$$ROOT" }
}
}
])
Also, you should apply limit and offset as aggregation stages, with $limit and $skip, rather than on the result cursor.
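As an alternative, assuming MongoDB 3.4+, a $facet stage can return both the page of data and the total count in one pass; a sketch of such a stage (the page/pageSize paging parameters are hypothetical):

```javascript
// Hypothetical paging parameters.
const page = 0;
const pageSize = 20;

// $facet runs both sub-pipelines over the same input documents, so
// $skip/$limit only affect the "data" branch and the count stays a total.
const facetStage = {
  $facet: {
    data: [{ $skip: page * pageSize }, { $limit: pageSize }],
    totalcount: [{ $count: "totalcount" }]
  }
};
```

Appended after the $match and $project stages, this replaces the $group/$push approach and avoids pushing every document into a single array just to count it.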
I have a field in my MongoDB collection products called date_expired.
It's of type Date and stores a date value.
I want to retrieve all the products and change the date_expired property in the result to number of hours left from now. How do I do this?
It's similar to getter() in Laravel...?
You could create a virtual property that will return the number of hours until expiry:
ProductSchema.virtual('hoursToExpiry').get(function() {
return (this.date_expired - Date.now()) / 3600000;
});
To access this property:
console.log('hours to expiry:', doc.hoursToExpiry)
If you want to include that property in any JSON or JS object, make sure that you set virtuals : true:
console.log('%j', doc.toJSON({ virtuals : true }));
Would consider using the aggregation framework in this case to output the transformation. You can use the $project pipeline arithmetic operators $divide and $subtract to achieve the final goal. These will enable you to carry out the arithmetic of calculating the number of hours to expiry i.e. implement the formula:
hoursToExpiry = (date_expired - timeNow) / (1000*60*60) // both dates are in milliseconds; 1000*60*60 ms per hour
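The millisecond arithmetic can be sanity-checked in plain JavaScript before wiring it into the pipeline:

```javascript
// 1000 ms * 60 s * 60 min = milliseconds per hour.
const MS_PER_HOUR = 1000 * 60 * 60;

function hoursToExpiry(dateExpired, now) {
  // Same subtraction and division the $subtract/$divide stages perform.
  return (dateExpired.getTime() - now.getTime()) / MS_PER_HOUR;
}

// A product expiring exactly 3 hours after "now":
const now = new Date("2016-06-11T13:00:00Z");
const expires = new Date("2016-06-11T16:00:00Z");
console.log(hoursToExpiry(expires, now)); // 3
```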
Take for instance the following short mongo shell demo that will strive to drive home this concept:
Populate test collection:
db.test.insert([
{
"date_expired": ISODate("2016-03-27T10:55:13.069Z"),
"name": "foo"
},
{
"date_expired": ISODate("2016-06-11T20:55:13.069Z"),
"name": "bar"
},
{
"date_expired": ISODate("2016-06-11T16:17:23.069Z"),
"name": "buzz"
}
])
Aggregation Operation:
db.test.aggregate([
{
"$project": {
"name": 1,
"dateExpired": "$date_expired",
"dateNow": { "$literal": new Date() },
"hoursToExpiry": {
"$divide": [
{ "$subtract": [ "$date_expired", new Date() ] },
1000*60*60
]
}
}
}
])
Result (at the time of writing):
{
"result" : [
{
"_id" : ObjectId("575c0f6e8101b29fc93e5b9d"),
"name" : "foo",
"dateExpired" : ISODate("2016-03-27T10:55:13.069Z"),
"dateNow" : ISODate("2016-06-11T13:36:21.025Z"),
"hoursToExpiry" : -1826.685543333333
},
{
"_id" : ObjectId("575c0f6e8101b29fc93e5b9e"),
"name" : "bar",
"dateExpired" : ISODate("2016-06-11T20:55:13.069Z"),
"dateNow" : ISODate("2016-06-11T13:36:21.025Z"),
"hoursToExpiry" : 7.314456666666667
},
{
"_id" : ObjectId("575c0f6e8101b29fc93e5b9f"),
"name" : "buzz",
"dateExpired" : ISODate("2016-06-11T16:17:23.069Z"),
"dateNow" : ISODate("2016-06-11T13:36:21.025Z"),
"hoursToExpiry" : 2.683901111111111
}
],
"ok" : 1
}
With the above pipeline, you can then adapt it to your Mongoose implementation, with the aggregate() method as the basis of your query:
Product.aggregate([
{
"$project": {
"name": 1,
"dateExpired": "$date_expired",
"dateNow": { "$literal": new Date() },
"hoursToExpiry": {
"$divide": [
{ "$subtract": [ "$date_expired", new Date() ] },
1000*60*60
]
}
}
}
]).exec(function (err, result) {
// Handle err
console.log(result);
});
or using the more fluent API:
Product.aggregate()
.project({
"name": 1,
"dateExpired": "$date_expired",
"dateNow": { "$literal": new Date() },
"hoursToExpiry": {
"$divide": [
{ "$subtract": [ "$date_expired", new Date() ] },
1000*60*60
]
}
})
.exec(function (err, result) {
// Handle err
console.log(result);
});
Product and variants price
Find Minimum of all products.Variants.Price where size is small and update it by 15%
{
"_id" : 23,
"name" : "Polo Shirt",
"Variants" : [
{
"size" : "Large",
"Price" : 82.42
},
{
"size" : "Medium",
"Price" : 20.82 // this should get increased by 15%
},
{
"size" : "Small",
"Price" : 42.29
}
]
},
{
"_id" : 24,
"name" : "Polo Shirt 2",
"Variants" : [
{
"size" : "Large",
"Price" : 182.42
},
{
"size" : "Medium",
"Price" : 120.82 // this should get increased by 15%
},
{
"size" : "Small",
"Price" : 142.29
}
]
}
I started something like this. Not sure if this is the right start
db.products.find().forEach(function(product){
var myArr = product.Variants;
print(myArr.min());
});
There is a problem here in that you cannot, in a single update statement, identify the "minimum" value in an array to use with a positional update, so you are right in a way with your current approach.
Arguably a better approach is to pre-determine which element is the minimal one and then pass that to the update. You can do this using .aggregate():
var result = db.products.aggregate([
    { "$unwind": "$Variants" },
    { "$sort": { "_id": 1, "Variants.Price": 1 } },
    { "$group": {
        "_id": "$_id",
        "size": { "$first": "$Variants.size" },
        "price": { "$first": "$Variants.Price" }
    }},
    { "$project": {
        "size": 1,
        "price": 1,
        "adjusted": { "$multiply": [ "$price", 1.15 ] }
    }}
])
Of course, that is so far only a result with the lowest Variant details for each product, but you could then use the results like this:
result.result.forEach(function(doc) {
    db.products.update(
        {
            "_id": doc._id,
            "Variants": { "$elemMatch": {
                "size": doc.size,
                "Price": doc.price
            }}
        },
        {
            "$set": {
                "Variants.$.Price": doc.adjusted
            }
        }
    );
});
That is not the best form, but it does at least remove some of the overhead of iterating the array client-side and moves the calculations onto the server hardware, which is possibly of a higher spec than the client.
It still doesn't really look like too much though until you take in some features available for MongoDB 2.6 and upwards. Notably that aggregate gets a cursor for a response and that you can now also do "bulk updates". So the form can be changed like so:
var cursor = db.products.aggregate([
    { "$unwind": "$Variants" },
    { "$sort": { "_id": 1, "Variants.Price": 1 } },
    { "$group": {
        "_id": "$_id",
        "size": { "$first": "$Variants.size" },
        "price": { "$first": "$Variants.Price" }
    }},
    { "$project": {
        "size": 1,
        "price": 1,
        "adjusted": { "$multiply": [ "$price", 1.15 ] }
    }}
]);
var batch = [];
while ( cursor.hasNext() ) {
    var doc = cursor.next();
    batch.push({
        "q": {
            "_id": doc._id,
            "Variants": { "$elemMatch": {
                "size": doc.size,
                "Price": doc.price
            }}
        },
        "u": {
            "$set": {
                "Variants.$.Price": doc.adjusted
            }
        }
    });

    if ( batch.length == 500 ) {
        db.runCommand({ "update": "products", "updates": batch });
        batch = [];   // start a new batch
    }
}

if ( batch.length > 0 )
    db.runCommand({ "update": "products", "updates": batch });
So that is really nice: while you are still iterating over a list, the traffic and the waiting for responses over the wire have been minimized. The best part is the batched updates, which are sent only once per 500 items. The maximum size of a batch is actually the 16MB BSON limit, so you can tune that as appropriate.
That gives a few good reasons if you are currently developing a product to move to the 2.6 version.
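On current drivers and shells the same batching is wrapped by bulkWrite(), which handles the batch sizing for you; a sketch of building the operations list from the aggregation results (the sample results array here is made up, and the field names mirror the pipeline above):

```javascript
// Stand-in for the aggregation output: one product whose cheapest
// variant needs its Price raised by 15%.
const results = [
  { _id: 23, size: "Medium", price: 20.82, adjusted: 23.943 }
];

// One updateOne operation per aggregation result, matching the exact
// array element via $elemMatch and updating it positionally.
const ops = results.map(doc => ({
  updateOne: {
    filter: {
      _id: doc._id,
      Variants: { $elemMatch: { size: doc.size, Price: doc.price } }
    },
    update: { $set: { "Variants.$.Price": doc.adjusted } }
  }
}));

// db.products.bulkWrite(ops);  // run against a live collection
console.log(ops.length); // 1
```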
The only final footnote I would add considering you are dealing with "prices" is try not to use floating point math for this and look for a form using whole integers instead as it avoids a lot of problems.
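A minimal sketch of the integer-cents idea in JavaScript:

```javascript
// Do price arithmetic in whole cents to avoid binary
// floating-point drift on values like 20.82.
const priceInCents = Math.round(20.82 * 100); // 2082

// Apply the 15% increase in integer arithmetic, rounding to the nearest cent.
const adjustedCents = Math.round(priceInCents * 115 / 100);
console.log(adjustedCents / 100); // 23.94
```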
This is how I pulled it off.
var result = db.Products.aggregate([
    { "$unwind": "$Variants" },
    { "$match": { "Variants.size": "Small" } },
    { "$group": {
        "_id": "$_id",
        "minprice": { "$min": "$Variants.Price" }
    }},
    { "$sort": { _id: 1 } }
])
result.result.forEach(function(doc) {
    // Remove the old element first, then add the adjusted one;
    // $pull and $addToSet cannot target the same field in a single update.
    db.Products.update(
        { "_id": doc._id },
        { "$pull": { "Variants": { "Price": doc.minprice, "size": "Small" } } }
    );
    db.Products.update(
        { "_id": doc._id },
        { "$addToSet": { "Variants": { "Price": doc.minprice * 1.15, "size": "Small" } } }
    );
});