Mongo aggregate - Split interval value into months affected - node.js

I have a tricky aggregation of data in mongo and I have no idea how to achieve it directly in mongo without any later data processing.
Here is a simplified example of the documents in my collection:
[
{
"from" : ISODate("2017-01-15T00:00:00.000Z"),
"to" : ISODate("2017-02-15T00:00:00.000Z"),
"value" : 1000
},
{
"from" : ISODate("2017-02-01T00:00:00.000Z"),
"to" : ISODate("2017-02-28T00:00:00.000Z"),
"value" : 2000
},
{
"from" : ISODate("2017-02-20T00:00:00.000Z"),
"to" : ISODate("2017-03-14T00:00:00.000Z"),
"value" : 1000
}
]
Now I would like to get the monthly sum of the values belonging to each month.
[
{january: 500}, /* 1/2 of interval id 1 is January so take half the value */
{february: 2833}, /* 500 + 2000 + 333 */
{march: 666}, /* 2/3 of interval id 3 is March */
]
The calculation has to be precise, so I can't simplify things by assuming that every month has exactly 30 days. What I can do is provide the number of days for each month of the interval from code, so it should be possible to pass the query information such as january2017 = 31 days, february2017 = 28 days, march2017 = 31 days.
I know I could do this in my node.js code, but there might be A LOT of documents in that DB, so I would rather not fetch all of them to the server to perform the calculation.

Pah, I hope somebody else comes up with a nicer answer but here is one way of getting there:
db.collection.aggregate({
$addFields: {
dayFrom: { $dayOfMonth: "$from" },
dayTo: { $dayOfMonth: "$to" },
monthFrom: { $month: "$from" },
monthTo: { $month: "$to" },
numberOfDays: { $subtract: [ { $dayOfMonth: "$to" }, { $dayOfMonth: "$from" } ] },
numberOfMonths: { $subtract: [ { $month: "$to" }, { $month: "$from" } ] },
}
}, {
$addFields: {
numberOfDaysInFromMonth: { $dayOfMonth: { $subtract: [ { $dateFromParts : { year: { $year: "$from" }, month: { $add: [ "$monthFrom", 1 ] }, day: 1 } }, 1 ] } },
}
}, {
$addFields: {
numberOfDaysAccountingForFromMonth: { $subtract: [ { $add: [ "$numberOfDaysInFromMonth", 1 ] }, "$dayFrom" ] },
numberOfDaysAccountingForToMonth: { $subtract: [ "$dayTo", 1 ] }, // assuming the "to" day does not count anymore
}
}, {
$addFields: {
totalNumberOfDays: { $add: [ "$numberOfDaysAccountingForFromMonth", "$numberOfDaysAccountingForToMonth" ] }
}
}, {
$addFields: {
percentageAccountingForFromMonth: { $divide: [ "$numberOfDaysAccountingForFromMonth", "$totalNumberOfDays" ] },
percentageAccountingForToMonth: { $divide: [ "$numberOfDaysAccountingForToMonth", "$totalNumberOfDays" ] },
}
}, {
$facet: {
"from": [{
$group: {
_id: "$monthFrom",
sum: { $sum: { $multiply: [ "$value", "$percentageAccountingForFromMonth" ] } }
}
}],
"to": [{
$group: {
_id: "$monthTo",
sum: { $sum: { $multiply: [ "$value", "$percentageAccountingForToMonth" ] } }
}
}]
}
}, {
$project: {
total: { $concatArrays: [ "$from", "$to" ] }
}
}, {
$unwind: "$total"
}, {
$group: {
_id: "$total._id",
sum: { $sum: "$total.sum" }
}
})
Some remarks:
You will need to refine that to match your precise definition of what forms part of a date range and how to count the number of days ("is 2018-01-30 to 2018-01-31 one day or is it two days?").
You might be able to beautify that query using $let and some nesting. I thought it would be easier to use subsequent $addFields stages to make the beast easier to follow through.
The code does not support from and to values that touch more than two months (e.g. 2018-01-01 to 2018-03-01).
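The proportional split itself is easy to check outside the database. The following is a minimal plain JavaScript sketch, not part of the pipeline above, that distributes a value over any number of calendar months in proportion to the days covered. The helper name splitByMonth is hypothetical, dates are treated as UTC, and the "to" day is assumed not to count, matching the comment in the pipeline. It may serve as a reference for validating the pipeline on a few sample documents.

```javascript
// Hypothetical helper: split `value` across calendar months in proportion
// to the number of days each month contributes to the interval [from, to).
// Assumes UTC midnight dates and an exclusive "to" day.
function splitByMonth(from, to, value) {
  const MS_PER_DAY = 24 * 60 * 60 * 1000;
  const totalDays = (to - from) / MS_PER_DAY;
  const parts = {};
  let cursor = new Date(from.getTime());
  while (cursor < to) {
    // First day of the following month; Date.UTC carries month 12 into the next year.
    const nextMonth = new Date(
      Date.UTC(cursor.getUTCFullYear(), cursor.getUTCMonth() + 1, 1)
    );
    const end = nextMonth < to ? nextMonth : to;
    const days = (end - cursor) / MS_PER_DAY;
    const month = String(cursor.getUTCMonth() + 1).padStart(2, "0");
    const key = `${cursor.getUTCFullYear()}-${month}`;
    parts[key] = (parts[key] || 0) + (value * days) / totalDays;
    cursor = end;
  }
  return parts;
}
```

With this counting convention the first sample document (2017-01-15 to 2017-02-15) puts 17/31 of its value into January and 14/31 into February, so the exact figures depend on how the boundary days are defined.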

Related

Combine multiple projections with conditions in MongoDB aggregation

Orders.aggregate([{$match: { shippingType: "standardShipping" }},
{ $project: { standardShippingCount: { $size: "$products" } } },
])
Orders.aggregate([{$match: { shippingType: "expressShipping" } },
{ $project: { expressShippingCount: { $size: "$products" } } },
])
I need help finding out whether it's possible to combine these two queries into one. Any help is appreciated.
If you only need the total number of products ordered with each shippingType across the whole collection, you can use this aggregate query:
[
{
$unwind: {
path: "$products"
},
},
{
$group: {
_id: "$shippingType",
count: {
$count: {}
}
}
},
]
It will give you a response in this format
[
{
"_id": "expressShipping",
"count": 14
},
{
"_id": "standardShipping",
"count": 6
}
]
Now to convert it into a format that suits your use-case
queryResponse.reduce((acc, curr) => {
acc[`${curr._id}Count`] = curr.count;
return acc;
}, {})
Finally, you'll have this
{
expressShippingCount: 14,
standardShippingCount: 6
}
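Note that the $count accumulator is only available from MongoDB 5.0 on; { $sum: 1 } is the equivalent on older versions. If the individual products are not needed, the $unwind stage can probably be skipped by summing the array sizes directly. The sketch below shows that variant together with the reshaping step wrapped in a function. The names countPipeline and toCountObject are made up for illustration, and every order is assumed to have a products array, since $size fails on a missing or non-array field.

```javascript
// Hypothetical pipeline: one group per shippingType, adding up the size
// of each order's products array. Assumes `products` is always an array.
const countPipeline = [
  {
    $group: {
      _id: "$shippingType",
      count: { $sum: { $size: "$products" } },
    },
  },
];

// Turn [{ _id: "expressShipping", count: 14 }, ...] into
// { expressShippingCount: 14, ... }.
function toCountObject(queryResponse) {
  return queryResponse.reduce((acc, curr) => {
    acc[`${curr._id}Count`] = curr.count;
    return acc;
  }, {});
}
```

Inside an async function this would be used as toCountObject(await Orders.aggregate(countPipeline)).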

how to calculate total for each enum of a field in aggregate?

Hello, I have this function where I want to calculate the number of orders for each status in one array. The code is:
let statusEnum = ["pending", "canceled", "completed"];
let userOrders = await Orders.aggregate([
{
$match: {
$or: [
{ senderId: new mongoose.Types.ObjectId(req.user._id) },
{ driverId: new mongoose.Types.ObjectId(req.user._id) },
{ reciverId: new mongoose.Types.ObjectId(req.user._id) },
],
},
},
{
$group: {
_id: null,
totalOrders: { $sum: 1 },
totalPendingOrders: "??", //I want to determine this for each order status
totalCompletedOrders: "??",
totalCanceledOrders: "??",
},
},
]);
I could add a $match with {status: "pending"}, but that would keep only the pending orders. I could also map over the status enum, run the aggregation once per status and push each result into another array, but that just seems messy. Is there any other way to calculate the total for each order status using only one aggregate?
Thanks
You can use $group as you did, but with a condition:
db.collection.aggregate([
{
$group: {
_id: null,
totalPendingOrders: {
$sum: { $cond: [ { $eq: [ "$status", "pending" ] }, 1, 0 ] }
},
totalCompletedOrders: {
$sum: { $cond: [ { $eq: [ "$status", "completed" ] }, 1, 0 ] }
},
totalCanceledOrders: {
$sum: { $cond: [ { $eq: [ "$status", "canceled" ] }, 1, 0 ] }
}
}
}
])
Working Mongo playground
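Since the question already keeps the statuses in a statusEnum array, the repeated $cond fields could also be generated from that array instead of being typed out. This is only a sketch: buildStatusGroup is a hypothetical helper, and the field naming (totalPendingOrders and so on) simply follows the names used in the question.

```javascript
// Hypothetical helper: build a $group stage with one conditional counter
// per status value, plus the overall totalOrders count from the question.
function buildStatusGroup(statuses) {
  const group = { _id: null, totalOrders: { $sum: 1 } };
  for (const status of statuses) {
    // "pending" -> "totalPendingOrders"
    const name = `total${status[0].toUpperCase()}${status.slice(1)}Orders`;
    group[name] = { $sum: { $cond: [{ $eq: ["$status", status] }, 1, 0] } };
  }
  return { $group: group };
}

const statusEnum = ["pending", "canceled", "completed"];
const groupStage = buildStatusGroup(statusEnum);
```

The resulting groupStage can be placed after the $match stage from the question in the same aggregate call.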

How to sum up one record per day for one month?

I'm trying to sum up values in a month, but I only want to sum one entry per day for that month. I tried to use $limit before the $group, but that returns only one entry.
Thanks.
[
{
$match:
{
"fPort": 60,
"rxInfo.time":
{
"$gte": "2020-12-01T00:00:00.000000Z",
"$lte": "2020-12-31T59:59:59.935Z"
}
}
},
{ $limit: '1' }, ///// This only returns 1 record and not 1 per day.
{
$group:
{
_id: "9cd9cb00000124c1",
"Total":
{
"$sum": "$object.TotalDaily"
}
}
}
]
I managed with the following query.
[
{
$match:
{
"fPort": 60,
"rxInfo.time":
{
"$gte": "2020-12-01T00:00:00.000000Z",
"$lte": "2020-12-31T23:59:59.999999Z"
}
}
},
{
$project:
{
_id:1,
year:
{
$year:
{
$dateFromString:
{
dateString:
{"$arrayElemAt": [ "$rxInfo.time", 0 ] }
}
}
},
month:
{
$month:
{
$dateFromString:
{
dateString:
{"$arrayElemAt": [ "$rxInfo.time", 0 ] }
}
}
},
day:
{
$dayOfMonth:
{
$dateFromString:
{
dateString:
{"$arrayElemAt": [ "$rxInfo.time", 0 ] }
}
}
},
sum : "$object.TotalDaily"
}
},
{
$group:{ _id:{year:"$year",month:"$month",day:"$day"},TotalPerDay:{$sum:"$sum"}}
},
{
$group: { _id: "9cd9cb00000124c1","MonthlyTotal":{"$sum": "$TotalPerDay"},
}
}
]
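Note that the first $group in this query adds up every record of a day before the days are summed. If exactly one record per day should count, as the question asks, the per-day step has to pick a single record instead (in the pipeline that would presumably mean a $first accumulator after a $sort, which is an untested assumption). The plain JavaScript sketch below contrasts the two behaviours on in-memory data. The function name monthlyTotal and the record shape are assumptions derived from the field paths used in the query.

```javascript
// Hypothetical in-memory model of the two $group stages.
// Each record is assumed to look like:
//   { rxInfo: [{ time: "2020-12-01T08:00:00.000000Z" }], object: { TotalDaily: 5 } }
function monthlyTotal(records, onePerDay) {
  const perDay = new Map();
  for (const rec of records) {
    const day = rec.rxInfo[0].time.slice(0, 10); // "YYYY-MM-DD"
    const value = rec.object.TotalDaily;
    if (!perDay.has(day)) {
      perDay.set(day, value); // first record seen for this day
    } else if (!onePerDay) {
      perDay.set(day, perDay.get(day) + value); // add further records of the day
    }
  }
  // Second stage: sum the per-day values into one total.
  let total = 0;
  for (const v of perDay.values()) total += v;
  return total;
}
```

Calling monthlyTotal(records, false) mirrors the query above, while monthlyTotal(records, true) counts only the first record encountered for each day.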

Retrieving different aggregated fields with mongoose

I am trying to wrap my head around a query that I want to write with mongoose on Node.js. Here is my dataset:
{"_id":{"$oid":"5e49c389e3c23a1da881c1c9"},"name":"New York","good_incidents":{"$numberInt":"50"},"salary":{"$numberInt":"50000"},"bad_incidents":"30"}
{"_id":{"$oid":"5e49c3bbe3c23a1da881c1ca"},"name":"Cairo","bad_incidents":{"$numberInt":"59"},"salary":{"$numberInt":"15000"}}
{"_id":{"$oid":"5e49c42de3c23a1da881c1cb"},"name":"Berlin","incidents":{"$numberInt":"30"},"bad_incidents":"15","salary":{"$numberInt":"55000"}}
{"_id":{"$oid":"5e49c58ee3c23a1da881c1cc"},"name":"New York","good_incidents":{"$numberInt":"15"},"salary":{"$numberInt":"56500"}}
What I am trying to do is get these values:
The most repeated city in collection
The average of bad_incidents
The maximum value of good_incidents
Maximum salary where there are no bad_incidents
I am trying to work out how I can do this in one query, because I only need one value per field. I would be glad if somebody could point me in the right direction. No need for a full solution.
Regards!
You can run a MongoDB aggregation with the $facet operator, which lets you compute several aggregations at once.
db.collection.aggregate([
{
$facet: {
repeated_city: [
{
$group: {
_id: "$name",
name: {
$first: "$name"
},
count: {
$sum: 1
}
}
},
{
$match: {
count: {
$gt: 1
}
}
},
{
$sort: {
count: -1
}
},
{
$limit: 1
}
],
bad_incidents: [
{
$group: {
_id: null,
avg_bad_incidents: {
$avg: {
$toInt: "$bad_incidents"
}
}
}
}
],
good_incidents: [
{
$group: {
_id: null,
max_good_incidents: {
$max: {
$toInt: "$good_incidents"
}
}
}
}
],
max_salary: [
{
$match: {
bad_incidents: {
$exists: false
}
}
},
{
$group: {
_id: null,
max_salary: {
$max: {
$toInt: "$salary"
}
}
}
}
]
}
},
{
$replaceWith: {
$mergeObjects: [
{
$arrayElemAt: [
"$repeated_city",
0
]
},
{
$arrayElemAt: [
"$bad_incidents",
0
]
},
{
$arrayElemAt: [
"$good_incidents",
0
]
},
{
$arrayElemAt: [
"$max_salary",
0
]
}
]
}
}
])
MongoPlayground
[
{
"_id": null,
"avg_bad_incidents": 34.666666666666664,
"count": 2,
"max_good_incidents": 50,
"max_salary": 56500,
"name": "New York"
}
]
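The numbers in that result can be cross-checked against the four sample documents in plain JavaScript. This is only a sanity check, not a replacement for the $facet query. The helper name summarize is hypothetical, the Extended JSON wrappers are reduced to plain values, and unlike the facet it does not filter out cities that occur only once.

```javascript
// The four sample documents from the question, reduced to plain values.
const docs = [
  { name: "New York", good_incidents: 50, salary: 50000, bad_incidents: "30" },
  { name: "Cairo", bad_incidents: 59, salary: 15000 },
  { name: "Berlin", incidents: 30, bad_incidents: "15", salary: 55000 },
  { name: "New York", good_incidents: 15, salary: 56500 },
];

// Hypothetical helper mirroring the four facets.
function summarize(docs) {
  // Most repeated city: count per name, then take the highest count.
  const counts = {};
  for (const d of docs) counts[d.name] = (counts[d.name] || 0) + 1;
  const [name, count] = Object.entries(counts).sort((a, b) => b[1] - a[1])[0];

  // Missing fields are skipped, string values are converted like $toInt would.
  const bad = docs
    .filter((d) => d.bad_incidents !== undefined)
    .map((d) => Number(d.bad_incidents));
  const good = docs
    .filter((d) => d.good_incidents !== undefined)
    .map((d) => Number(d.good_incidents));
  const salaries = docs
    .filter((d) => d.bad_incidents === undefined)
    .map((d) => d.salary);

  return {
    name,
    count,
    avg_bad_incidents: bad.reduce((a, b) => a + b, 0) / bad.length,
    max_good_incidents: Math.max(...good),
    max_salary: Math.max(...salaries),
  };
}
```

Here the average is taken over the three documents that have a bad_incidents value, and the salary maximum only considers the one document without it.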

Use calculated value for comparison in aggregation

var data_form = [
{
_id : "123",
result:{
run:10
},
result_re:{
run:10
},
result_ch:{
run:10
},
result_qm:{
run:10
}
},
{
_id : "345",
result:{
run:20
},
result_re:{
run:20
},
result_ch:{
run:20
},
result_qm:{
run:20
}
},
{
_id : "567",
result:{
run:30
},
result_re:{
run:30
},
result_ch:{
run:30
},
result_qm:{
run:30
}
}
]
var pipeline = [
{ $project: {
total: { $add: [ "$result.run", "$result_re.run", "$result_ch.run", "$result_qm.run"] } ,
discount:{
$cond: [ { $gt: [ total , 50 ] }, 1, 0]
}
}
},
{ $sort: {total: -1}},
{ $limit : 10 }
]
db.getCollection('game_users').aggregate(pipeline)
I need to compare the computed total against a condition inside the aggregation and increase a counter when the condition matches.
My collection is defined in the data_form variable.
The total field comes from the query, and if that total is greater than 50 the counter should increase.
You need to specify the full expression within the $cond. You cannot reference the value of another calculated field within the same pipeline stage. Either write the expression twice or split the work into separate stages. Doing it in the same stage is the most efficient:
var pipeline = [
{ $project: {
total: {
$add: [
"$result.run",
"$result_re.run",
"$result_ch.run",
"$result_qm.run"
]
} ,
discount:{
$cond: [
{ $gt: [
{ $add: [
"$result.run",
"$result_re.run",
"$result_ch.run",
"$result_qm.run"
]},
50
]},
1,
0
]
}
}},
{ $sort: {total: -1}},
{ $limit : 10 }
]
Or separate the $project into two stages:
var pipeline = [
{ $project: {
total: {
$add: [
"$result.run",
"$result_re.run",
"$result_ch.run",
"$result_qm.run"
]
}
}},
{ $project: {
total: 1,
discount:{
$cond: [
{ $gt: [ "$total", 50 ] },
1,
0
]
}
}},
{ $sort: {total: -1}},
{ $limit : 10 }
]
This looks "prettier", but it adds another stage that every document has to pass through, so it's probably best to do it in one stage.
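When the pipeline is built in JavaScript, the duplication in the single-stage form can at least be kept out of the source by defining the sum expression once and reusing it. This is a small sketch; the variable name totalExpr is made up, and the server still receives the expression twice, so it changes readability only.

```javascript
// Define the sum once and reuse the same expression object in both fields.
const totalExpr = {
  $add: ["$result.run", "$result_re.run", "$result_ch.run", "$result_qm.run"],
};

const pipeline = [
  {
    $project: {
      total: totalExpr,
      // 1 when the computed total exceeds 50, otherwise 0.
      discount: { $cond: [{ $gt: [totalExpr, 50] }, 1, 0] },
    },
  },
  { $sort: { total: -1 } },
  { $limit: 10 },
];
```

The pipeline is then passed to db.getCollection('game_users').aggregate(pipeline) exactly as before.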
To get the "totals" across the collection, run an aggregation that is separate from the one returning the paged results.
var pipeline = [
{ $group: {
_id: null,
total: {
$sum: {
$add: [
"$result.run",
"$result_re.run",
"$result_ch.run",
"$result_qm.run"
]
}
} ,
discount:{
$sum: {
$cond: [
{ $gt: [
{ $add: [
"$result.run",
"$result_re.run",
"$result_ch.run",
"$result_qm.run"
]},
50
]},
1,
0
]
}
}
}}
];
Do not try to return both the paged results and the totals in the same response. They should be run separately, because combining them into one result document is likely to exceed the BSON document size limit in real-world use cases.

Resources