I have this schema:
var salesExpenseSchema = new Schema({
date : {
month: Number
},
sales: [{amount : Schema.Types.Decimal128}],
expenses: [{amount : Schema.Types.Decimal128}]
});
Example database records look like this:
{
_id:'5dbac5dfa1488240cbc4f838',
date:{month:11},
sales:[{amount:3000},{amount:5000}],
expenses: [{amount:5000},{amount:500}]
},
{
_id:'5dbac5dfa1488240cbc4f838',
date:{month:10},
sales:[{amount:2000},{amount:5000}],
expenses: [{amount:500},{amount:800}]
},
{
_id:'5dbac5dfa1488240cbc4f838',
date:{month:09},
sales:[{amount:2000},{amount:4000}],
expenses: [{amount:200},{amount:300}]
}
Now I want to get the sum of sales and the sum of expenses.
I have used aggregate with $unwind on both sales and expenses, like this:
SalesExpense.aggregate([
{$unwind: "$sales"},
{$unwind: "$expenses"},
{$group:{
_id:'$_id',
sales:{$sum: "$sales.sellPrices"},
expenses:{$sum: "$expenses.amount"},
}
}
])
But the problem is that if one array has data and the other has none, the result is 0, i.e. the real sum isn't obtained. That is to say, if there are sales but no expenses, their sum becomes 0, and vice versa.
I want to get the sum of both sales and expenses even when one of them has no data. How do I achieve this?
EDIT:
I have edited the question and added the date object to my schema and to the database records. I want to compute these sums per month, so that each month has its own sales and expenses, like a timeline.
I have tried using $group before $project:
{$group:{
_id:'$date.month'}}
But that does not seem to give the expected results.
I want an output like this one:
[
{
"month": "11",
"sales": {
"$numberDecimal": "8000"
},
"expenses": {
"$numberDecimal": "5500"
}
},
{
"month": "10",
"sales": {
"$numberDecimal": "7000"
},
"expenses": {
"$numberDecimal": "1100"
}
},
{
"month": "09",
"sales": {
"$numberDecimal": "6000"
},
"expenses": {
"$numberDecimal": "500"
}
},
]
How can I achieve this?
You can group by month and get the totals like this:
db.collection.aggregate([
{
$group: {
_id: "$date.month",
"sales": {
"$sum": {
"$sum": "$sales.amount"
}
},
"expenses": {
"$sum": {
"$sum": "$expenses.amount"
}
}
}
}
])
Sample Data:
[
{
_id: "5dbac5dfa1488240cbc4f838",
date: {
month: 11
},
sales: [
{
amount: 1
},
{
amount: 2
}
],
expenses: []
},
{
_id: "5dbac5dfa1488240cbc4f839",
date: {
month: 11
},
sales: [
{
amount: 5
},
{
amount: 6
}
],
expenses: [
{
amount: 7
},
{
amount: 8
}
]
},
{
_id: "5dbac5dfa1488240cbc4f840",
date: {
month: 12
},
sales: [],
expenses: [
{
amount: 7
},
{
amount: 8
}
]
}
]
Result:
[
{
"_id": 12,
"expenses": 15,
"sales": 0
},
{
"_id": 11,
"expenses": 15,
"sales": 14
}
]
Playground:
https://mongoplayground.net/p/K9ofoZx5ORI
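To make the nested $sum easier to follow: the inner $sum adds up the amounts inside one document's array (an empty array gives 0), and the outer $sum accumulates those per-document totals for each month. By contrast, $unwind drops documents whose array is empty by default, and unwinding two arrays in a row multiplies the rows, which is probably why the original totals were wrong. Below is a plain JavaScript sketch of the same grouping, for illustration only. sumByMonth is a hypothetical helper, not a MongoDB or Mongoose API, and it uses plain numbers instead of Decimal128.

```javascript
// Hypothetical helper: a plain-JavaScript mirror of the $group stage above.
function sumByMonth(docs) {
  // Inner sum: total of one document's array (0 when the array is empty).
  const sumAmounts = (items) =>
    (items || []).reduce((acc, item) => acc + item.amount, 0);

  const totals = new Map();
  for (const doc of docs) {
    const month = doc.date.month;
    if (!totals.has(month)) {
      totals.set(month, { _id: month, sales: 0, expenses: 0 });
    }
    // Outer sum: accumulate the per-document totals into the month's bucket.
    const entry = totals.get(month);
    entry.sales += sumAmounts(doc.sales);
    entry.expenses += sumAmounts(doc.expenses);
  }
  return [...totals.values()];
}

// Same shape as the sample data above (ids omitted).
const sample = [
  { date: { month: 11 }, sales: [{ amount: 1 }, { amount: 2 }], expenses: [] },
  { date: { month: 11 }, sales: [{ amount: 5 }, { amount: 6 }], expenses: [{ amount: 7 }, { amount: 8 }] },
  { date: { month: 12 }, sales: [], expenses: [{ amount: 7 }, { amount: 8 }] },
];

console.log(sumByMonth(sample));
```

This yields the same totals as the Result above (month 11: sales 14, expenses 15; month 12: sales 0, expenses 15), although the order of the groups may differ.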
Related
I have a Mongo database filled with "Events" records, that look like this:
{
timestamp: 2022-03-15T22:11:34.711Z,
_id: new ObjectId("62310f16b0d71321e887a905")
}
Using a Node.js server, I need to fetch the last 30 days of Events, grouped/summed by date, and any dates within those 30 days that have no records need to be filled with 0.
Using this code I can get the correct events, grouped/summed by date:
Event.aggregate( [
{
$match: {
timestamp: {
$gte: start,
$lte: end,
}
}
},
{
$project: {
date: {
$dateToParts: { date: "$timestamp" }
},
}
},
{
$group: {
_id: {
date: {
year: "$date.year",
month: "$date.month",
day: "$date.day"
}
},
"count": { "$sum": 1 }
}
}
] )
This will return something like this:
[
{
"_id": {
"date": {
"year": 2022,
"month": 3,
"day": 14
}
},
"count": 3
},
{
"_id": {
"date": {
"year": 2022,
"month": 3,
"day": 15
}
},
"count": 8
},
]
I also have this Javascript code to generate the last 30 days of dates:
const getDateRange = (start, end) => {
const arr = [];
for(let dt = new Date(start); dt <= end; dt.setDate(dt.getDate() + 1)){
arr.push(new Date(dt));
}
return arr;
};
const subtractDays = (date, days) => {
return new Date(date.getTime() - (days * 24 * 60 * 60 * 1000));
}
const end = new Date();
const start = subtractDays(end, 30);
const range = getDateRange(start, end);
Which returns something like this:
[
2022-03-09T01:13:10.769Z,
2022-03-10T01:13:10.769Z,
2022-03-11T01:13:10.769Z,
2022-03-12T01:13:10.769Z,
2022-03-13T01:13:10.769Z,
...
]
It seems like I have all the pieces, but I'm having trouble putting all this together to do what I need in an efficient way. Any push in the right direction would be appreciated.
Whenever you have to do date/time arithmetic, I recommend a library like moment.js:
const end = moment().startOf('day').toDate();
const start = moment().startOf('day').subtract(30, 'day').toDate();
In MongoDB 5.0 and later you can use $dateTrunc, which is shorter than $dateToParts combined with { year: "$date.year", month: "$date.month", day: "$date.day" }.
You need to put all data in an array ({$group: {_id: null, data: { $push: "$$ROOT" }}}) and then add the missing elements with $ifNull:
Event.aggregate([
{
$match: {
timestamp: { $gte: start, $lte: end }
}
},
{
$group: {
_id: { $dateTrunc: { date: "$timestamp", unit: "day" } },
count: { $sum: 1 }
}
},
{ $project: {timestamp: "$_id", count: 1, _id: 0} },
{
$group: {
_id: null,
data: { $push: "$$ROOT" }
}
},
{
$set: {
data: {
$map: {
input: { $range: [0, 30] },
as: "i",
in: {
$let: {
vars: {
day: { $dateAdd: { startDate: start, unit: "day", amount: "$$i" } }
},
in: {
$ifNull: [
{
$first: {
$filter: {
input: "$data",
cond: { $eq: ["$$this.timestamp", "$$day"] }
}
}
},
{ timestamp: "$$day", count: 0 }
]
}
}
}
}
}
}
},
{ $unwind: "$data" }
])
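The $range operator only supports integer values, which is why $let is used to turn each integer offset into a date. Alternatively, if you prefer to use the externally generated range, the stage would look like this (the dates in range must then be truncated to midnight UTC so that they can match the $dateTrunc output):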
{
$set: {
data: {
$map: {
input: range,
as: "day",
in: {
$ifNull: [
{
$first: {
$filter: {
input: "$data",
cond: { $eq: ["$$this.timestamp", "$$day"] }
}
}
},
{ timestamp: "$$day", count: 0 }
]
}
}
}
}
}
In MongoDB 5.1 and later, you may also have a look at the $densify stage.
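A minimal sketch of how $densify could fill the gaps, assuming MongoDB 5.1 or later and that start is aligned to midnight UTC so the generated days line up with the $dateTrunc output. buildDailyCountPipeline is a hypothetical helper that only builds the pipeline array and does not run a query; the example dates are made up for illustration.

```javascript
// Hypothetical helper: returns the aggregation stages, nothing is executed here.
function buildDailyCountPipeline(start, end) {
  return [
    { $match: { timestamp: { $gte: start, $lt: end } } },
    {
      $group: {
        _id: { $dateTrunc: { date: "$timestamp", unit: "day" } },
        count: { $sum: 1 },
      },
    },
    { $project: { _id: 0, timestamp: "$_id", count: 1 } },
    // Add a document for every day in [start, end) that has no document yet.
    {
      $densify: {
        field: "timestamp",
        range: { step: 1, unit: "day", bounds: [start, end] },
      },
    },
    // Documents generated by $densify have no count field, so default it to 0.
    { $set: { count: { $ifNull: ["$count", 0] } } },
    { $sort: { timestamp: 1 } },
  ];
}

const DAY_MS = 24 * 60 * 60 * 1000;
const endOfRange = new Date(Date.UTC(2022, 2, 16)); // example value: 2022-03-16, midnight UTC
const startOfRange = new Date(endOfRange.getTime() - 30 * DAY_MS);
const densifyPipeline = buildDailyCountPipeline(startOfRange, endOfRange);

// Usage, assuming the Mongoose model from the question:
//   const rows = await Event.aggregate(densifyPipeline);
```

The upper bound of the $densify range is exclusive, which is why the $match stage uses $lt rather than $lte here.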
Use the $densify aggregation stage if you're using MongoDB 5.1 or later. For earlier versions, the query below can be used.
db.collection.aggregate([
{
$match: {
timestamp: {
$gte: {
"$date": "2022-03-01T00:00:00.000Z"
},
$lte: {
"$date": "2022-03-31T23:59:59.999Z"
},
}
}
},
{
$project: {
date: {
$dateToParts: {
date: "$timestamp"
}
},
}
},
{
$group: {
_id: {
date: {
year: "$date.year",
month: "$date.month",
day: "$date.day"
}
},
"count": {
"$sum": 1
}
}
},
{
"$group": {
"_id": null,
"originData": {
"$push": "$$ROOT"
}
}
},
{
"$project": {
"_id": 0,
"data": {
"$concatArrays": [
{
"$map": {
"input": {
"$range": [
0,
30,
1
]
},
"in": {
"$let": {
"vars": {
"date": {
"$add": [
{
"$date": "2022-03-01T00:00:00.000Z"
},
{
"$multiply": [
"$$this",
86400000
]
}
]
}
},
"in": {
"_id": {
"date": {
"day": {
"$dayOfMonth": "$$date"
},
"month": {
"$month": "$$date"
},
"year": {
"$year": "$$date"
}
}
},
"count": 0
}
}
}
}
},
"$originData"
]
}
}
},
{
"$unwind": "$data"
},
{
$group: {
_id: {
date: {
year: "$data._id.date.year",
month: "$data._id.date.month",
day: "$data._id.date.day"
}
},
"count": {
"$sum": "$data.count"
}
}
},
{
"$sort": {
"_id.date.year": 1,
"_id.date.month": 1,
"_id.date.day": 1
}
}
])
Link to online playground. https://mongoplayground.net/p/5I0I04HoHXm
I'm using this query to get the count of orders for each of the 7 days of the current week. The result I get is something like this:
[
{ _id: '2021-01-31', orders: 3 },
{ _id: '2021-02-01', orders: 1 },
{ _id: '2021-02-02', orders: 2 },
{ _id: '2021-02-06', orders: 2 }
]
The problem is that if there was no order on a specific day, the orders count should be 0 for that date. For example, there was no order on 2021-02-03, so the result should include { _id: '2021-02-03', orders: 0 }.
This is the query that I'm using:
let d = new Date()
d = new Date(Date.UTC(d.getFullYear(), d.getMonth(), d.getDate()));
d.setUTCDate(d.getUTCDate() + 4 - (d.getUTCDay()||7));
let yearStart = new Date(Date.UTC(d.getUTCFullYear(),0,1));
let weekNumber = Math.ceil(( ( (d - yearStart) / 86400000) + 1)/7);
const ordersweekly = await Order.aggregate([
{
"$set": { "date": { "$week": "$createdAt" } }
},
{
"$match": { "date": weekNumber }
},
{
"$set": { "date": { "$dateToString": { "format": "%Y-%m-%d", "date": "$createdAt" } } }
},
{
"$group": { "_id": "$date", "orders": { "$sum": 1 } }
},
{ "$sort": { "_id": 1 } }
])
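One way to get zero counts for days without orders is to fill the gaps in application code after the aggregation has run. The sketch below assumes the result shape shown above ({ _id: 'YYYY-MM-DD', orders: n }) and that the first day of the week is known as a UTC date string. fillWeek and its weekStart parameter are hypothetical names, not part of Mongoose.

```javascript
// Hypothetical helper: returns 7 entries, one per day, starting at weekStart.
// Days that are missing from `results` get orders: 0.
function fillWeek(results, weekStart) {
  const counts = new Map(results.map((r) => [r._id, r.orders]));
  const first = new Date(`${weekStart}T00:00:00.000Z`);
  const filled = [];
  for (let i = 0; i < 7; i++) {
    const day = new Date(first.getTime());
    day.setUTCDate(first.getUTCDate() + i); // rolls over month ends correctly
    const key = day.toISOString().slice(0, 10); // 'YYYY-MM-DD'
    filled.push({ _id: key, orders: counts.get(key) ?? 0 });
  }
  return filled;
}

// The example result from the question, for the week starting 2021-01-31.
const weekly = fillWeek(
  [
    { _id: "2021-01-31", orders: 3 },
    { _id: "2021-02-01", orders: 1 },
    { _id: "2021-02-02", orders: 2 },
    { _id: "2021-02-06", orders: 2 },
  ],
  "2021-01-31"
);
console.log(weekly);
```

With this input, the entries for 2021-02-03, 2021-02-04 and 2021-02-05 are added with orders: 0, and the existing counts are kept unchanged.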
My collection in mongodb looks like below:
Post:
// ...
tags: [
{
id: {
type: mongoose.Schema.Types.ObjectId,
ref: 'Tag',
required: true,
}
// ...
}
],
date: {
type: Date,
}
// ...
I want to write a query that produces the result below:
[
{
"month": "Jan",
"tag1": 5,
"tag2": 80,
// ...
},
{
"month": "Feb",
"tag1": 30,
"tag2": 95,
// ...
},
// ...
]
I think I need to use aggregation. Is that right?
I wrote this, but the result is not what I want.
const monthStrings = ["", "January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"];
Post.aggregate([
{
$match: {
$expr: {
$and: [
{ $gt: ["$created_at", oneYearFromNow] },
{ $lt: ["$created_at", dateNow] }
],
}
}
},
{
$group: {
_id: {
month: { $month: "$date" },
year: { $year: "$date" },
},
count: {
$sum: 1
}
}
},
{
$project: {
_id: {
$concat: [
{
$arrayElemAt: [
monthStrings,
"$_id.month"
]
},
"-",
"$_id.year"
]
},
count: 1,
}
}
])
How can I get the result that I want?
(The returned format is not really important, but what I am trying to achieve is in a single query to retrieve a number of counts for the same grouping (one per month).)
In order to count documents for each tag, you need to group by the tag identifier. Since tags are in an array, the easiest way is to first unwind the array before grouping. Finally, in order to get all tags for the same month in a single document, you need to perform a second group operation. For example (assuming the name of the tag you want to use is in the "ref" field):
Post.aggregate([
{
$match: ...
},
{
// Unwind the tags array, produces one document for each element in the array
$unwind: '$tags'
},
{
// Group by month, year and tag reference
$group: {
_id: {
month: { $month: '$date' },
year: { $year: '$date' },
tag: '$tags.ref'
},
count: { $sum: 1 }
}
},
{
// Group again by month and year to aggregate all tags in one document
$group: {
_id: {
month: '$_id.month',
year: '$_id.year'
},
// Collect tag counts into an array
tagCounts: {
$push: {
k: '$_id.tag',
v: '$count'
}
}
}
},
{
// Reshape document to have each tag in a separate field
$replaceRoot: {
newRoot: {
$mergeObjects: [
{ month: '$_id.month', year: '$_id.year' },
{ $arrayToObject: '$tagCounts' }
]
}
}
}
])
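To illustrate what the unwind and the two group stages compute, here is a plain JavaScript sketch of the same logic running on in-memory objects. It keeps the assumption from the answer that each tag's name lives in a "ref" field. countTagsByMonth and the sample posts are hypothetical and only meant as an illustration.

```javascript
// Hypothetical helper mirroring the pipeline above in plain JavaScript.
function countTagsByMonth(posts) {
  const byMonth = new Map();
  for (const post of posts) {
    const month = post.date.getUTCMonth() + 1; // like $month: 1-based, UTC
    const year = post.date.getUTCFullYear();
    const key = `${year}-${month}`;
    // "Unwind": handle each tag of the post separately.
    // Posts without tags contribute nothing, as with $unwind.
    for (const tag of post.tags) {
      if (!byMonth.has(key)) byMonth.set(key, { month, year });
      const row = byMonth.get(key);
      // Count per (month, year, tag), stored as one field per tag.
      row[tag.ref] = (row[tag.ref] || 0) + 1;
    }
  }
  return [...byMonth.values()];
}

const samplePosts = [
  { date: new Date("2020-01-05T00:00:00Z"), tags: [{ ref: "tag1" }, { ref: "tag2" }] },
  { date: new Date("2020-01-20T00:00:00Z"), tags: [{ ref: "tag1" }] },
  { date: new Date("2020-02-03T00:00:00Z"), tags: [{ ref: "tag2" }] },
];

console.log(countTagsByMonth(samplePosts));
```

For the sample posts this gives one object for January 2020 with tag1: 2 and tag2: 1, and one for February 2020 with tag2: 1.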
Can you please help? I'm trying to aggregate data over the past 12 months by both ALL publication data specific to a certain publisher and per publication to return a yearly graph data analysis based on the subscription type.
Here's a snapshot of the Subscriber model:
const SubscriberSchema = new Schema({
publication: { type: Schema.Types.ObjectId, ref: "publicationcollection" },
subType: { type: String }, // Print, Digital, Bundle
subStartDate: { type: Date },
subEndDate: { type: Date },
});
Here's some data for the reader (subscriber) collection:
{
_id: ObjectId("5dc14d3fc86c165ed48b6872"),
publication: ObjectId("5d89db9d82273f1d18970deb"),
subStartDate: "2019-11-20T00:00:00.000Z",
subtype: "print"
},
{
_id: ObjectId("5dc14d3fc86c165ed48b6871"),
publication: ObjectId("5d89db9d82273f1d18970deb"),
subStartDate: "2019-11-19T00:00:00.000Z",
subtype: "print"
},
{
_id: ObjectId("5dc14d3fc86c165ed48b6870"),
publication: ObjectId("5d89db9d82273f1d18970deb"),
subStartDate: "2019-11-18T00:00:00.000Z",
subtype: "digital"
},
{
_id: ObjectId("5dc14d3fc86c165ed48b6869"),
publication: ObjectId("5d8b36c3148c1e5aec64662c"),
subStartDate: "2019-11-19T00:00:00.000Z",
subtype: "print"
}
The publication model has plenty of fields but the _id and user fields are the only point of reference in the following queries.
Here's some data for the publication collection:
// Should use
{ "_id": {
"$oid": "5d8b36c3148c1e5aec64662c"
},
"user": {
"$oid": "5d24bbd89f09024590db9dcd"
},
"isDeleted": false
},
// Should use
{ "_id": {
"$oid": "5d89db9d82273f1d18970deb"
},
"user": {
"$oid": "5d24bbd89f09024590db9dcd"
},
"isDeleted": false
},
// Shouldn't use as deleted === true
{ "_id": {
"$oid": "5d89db9d82273f1d18970dec"
},
"user": {
"$oid": "5d24bbd89f09024590db9dcd"
},
"isDeleted": true
},
// Shouldn't use as different user ID
{ "_id": {
"$oid": "5d89db9d82273f1d18970dfc"
},
"user": {
"$oid": "5d24bbd89f09024590db9efd"
},
"isDeleted": true
}
When I do a lookup on a publication ID with the following, I'm getting perfect results:
Subscriber.aggregate([
{
$match: {
$and: [
{ 'publication': mongoose.Types.ObjectId(req.params.id) },
],
"$expr": { "$eq": [{ "$year": "$subStartDate" }, new Date().getFullYear()] }
}
},
{
/* group by year and month of the subscription event */
$group: {
_id: { year: { $year: "$subStartDate" }, month: { $month: "$subStartDate" }, subType: "$subType" },
count: { $sum: 1 }
},
},
{
/* sort descending (latest subscriptions first) */
$sort: {
'_id.year': -1,
'_id.month': -1
}
},
{
$limit: 100,
},
])
However, when I want to receive data from the readercollections (Subscriber Model) for ALL year data, I'm not getting the desired results (if any) from all of the things I'm trying (I'm posting the best attempt result below):
Publication.aggregate([
{
$match:
{
user: mongoose.Types.ObjectId(id),
isDeleted: false
}
},
{
$project: {
_id: 1,
}
},
{
$lookup: {
from: "readercollections",
let: { "id": "$_id" },
pipeline: [
{
$match:
{
$expr: {
$and: [
{ $eq: ["$publication", "$$id"] },
{ "$eq": [{ "$year": "$subStartDate" }, new Date().getFullYear()] }
],
}
}
},
{ $project: { subStartDate: 1, subType: 1 } }
],
as: "founditem"
}
},
// {
// /* group by year and month of the subscription event */
// $group: {
// _id: { year: { $year: "$founditem.subStartDate" }, month: { $month: "$foundtitem.subStartDate" }, subType: "$founditem.subType" },
// count: { $sum: 1 }
// },
// },
// {
// /* sort descending (latest subscriptions first) */
// $sort: {
// '_id.year': -1,
// '_id.month': -1
// }
// },
], function (err, result) {
if (err) {
console.log(err);
} else {
res.json(result);
}
})
This returns the desired data without the $group (commented out), but I need the $group to work; otherwise I'll have to map a dynamic array based on month and subtype, which is completely inefficient.
While diagnosing, it looks like this $group is the issue, but I can't see how to fix it, since it works in the single-publication $year/$month group. So I tried the following:
{
/* group by year and month of the subscription event */
$group: {
_id: { year: { $year: "$subStartDate" }, month: { $month: "$subStartDate" }, subType: "$founditem.subType" },
count: { $sum: 1 }
},
},
And it returned the $founditem.subType fine, but any count or attempt to get $year or $month of the $founditem.subStartDate gave a BSON error.
The output from the single publication ID lookup in the reader collection call that works (and is plugging into the line graph perfectly) is:
[
{
"_id": {
"year": 2019,
"month": 11,
"subType": "digital"
},
"count": 1
},
{
"_id": {
"year": 2019,
"month": 11,
"subType": "print"
},
"count": 3
}
]
This is the output I'd like for ALL publications rather than just a single lookup of a publication ID within the reader collection.
Thank you for any assistance and please let me know if you need more details!!
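One possible direction, offered as a sketch rather than a verified fix: $lookup stores its matches in founditem as an array, so "$founditem.subStartDate" resolves to an array of dates, while $year and $month expect a single date. Unwinding founditem before the $group gives one document per subscriber. The helper below only builds the pipeline; buildYearlySubscriptionPipeline is a hypothetical name, userId is assumed to already be an ObjectId, and subStartDate is assumed to be stored as a real Date as in the schema.

```javascript
// Hypothetical helper: returns the stages for Publication.aggregate(...).
function buildYearlySubscriptionPipeline(userId, year) {
  return [
    { $match: { user: userId, isDeleted: false } },
    { $project: { _id: 1 } },
    {
      $lookup: {
        from: "readercollections",
        let: { id: "$_id" },
        pipeline: [
          {
            $match: {
              $expr: {
                $and: [
                  { $eq: ["$publication", "$$id"] },
                  { $eq: [{ $year: "$subStartDate" }, year] },
                ],
              },
            },
          },
          { $project: { subStartDate: 1, subType: 1 } },
        ],
        as: "founditem",
      },
    },
    // One document per matched subscriber, so date operators see a single date.
    { $unwind: "$founditem" },
    {
      $group: {
        _id: {
          year: { $year: "$founditem.subStartDate" },
          month: { $month: "$founditem.subStartDate" },
          subType: "$founditem.subType",
        },
        count: { $sum: 1 },
      },
    },
    { $sort: { "_id.year": -1, "_id.month": -1 } },
  ];
}

// Usage, assuming the models from the question:
//   Publication.aggregate(
//     buildYearlySubscriptionPipeline(mongoose.Types.ObjectId(id), new Date().getFullYear())
//   )
const subscriptionPipeline = buildYearlySubscriptionPipeline("placeholder-user-id", 2019);
```

Because the $group no longer includes the publication _id, the counts are combined across all of the user's non-deleted publications.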
My MongoDB aggregation query should return daily, weekly, monthly, or yearly data within a specific date range.
Example: let's assume the dates between 1st Feb and 1st March.
The daily periods should be: 1st, 2nd, 3rd, ...
The weekly periods should be: 1st Feb, 8th Feb, 15th Feb, 22nd Feb, ...
The monthly periods should be: 1st Feb, 1st March, ...
Look at the example below.
Let's say my API accepts startDate, endDate, and interval as body params.
req.body will be something like this:
{
startDate: "",
endDate: "",
platform: "",
interval: "daily" // could be "weekly", "monthly", "yearly"
}
These params will be passed to my model where I have some aggregation code which will be mentioned below:
MessagesSchema.statics.totalMessages = ( startDate, endDate, platform, interval ) => {
return Messages.aggregate([{
$match: {
platform: platform,
timestamp: {
$gte: new Date(startDate),
$lte: new Date(endDate)
}
}
},
{
$project: {
timestamp: {
$dateToString: {
format: '%Y-%m-%d',
date: '$timestamp'
}
}
}
},
{
$group: {
_id: {
timestamp: '$timestamp'
},
count: {
$sum: 1
}
}
},
{
$sort: {
'_id.timestamp': 1
}
}
]).exec();
};
Let's assume Weekly data from 1st Feb 2019 - 1st March 2019;
expected result:
[
{
"_id": {
"timestamp": "2019-02-01"
},
"count": 2
},
{
"_id": {
"timestamp": "2019-02-08"
},
"count": 2
},
{
"_id": {
"timestamp": "2019-02-15"
},
"count": 2
}
]
actual result:
[
{
"_id": {
"timestamp": "2019-02-01"
},
"count": 2
},
{
"_id": {
"timestamp": "2019-03-02"
},
"count": 2
},
{
"_id": {
"timestamp": "2019-03-02"
},
"count": 2
}
]
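One possible approach, assuming MongoDB 5.0 or later, is to replace the fixed $dateToString format with $dateTrunc and pick its unit from the interval parameter. For weekly buckets, startOfWeek can be set to the weekday of startDate so that the weeks begin on 1st Feb, 8th Feb, and so on. buildIntervalGroupStage is a hypothetical helper that only builds the $group stage. Note that the bucket key becomes a Date rather than a string, and monthly or yearly buckets follow calendar months and years.

```javascript
// Maps the API's interval values to $dateTrunc units.
const UNITS = { daily: "day", weekly: "week", monthly: "month", yearly: "year" };
const WEEKDAYS = ["sunday", "monday", "tuesday", "wednesday", "thursday", "friday", "saturday"];

// Hypothetical helper: returns a $group stage bucketing by the given interval.
function buildIntervalGroupStage(interval, startDate) {
  const unit = UNITS[interval];
  if (!unit) {
    throw new Error(`Unsupported interval: ${interval}`);
  }
  const trunc = { date: "$timestamp", unit };
  if (unit === "week") {
    // Align weekly buckets with the weekday on which the range starts (UTC).
    trunc.startOfWeek = WEEKDAYS[new Date(startDate).getUTCDay()];
  }
  return {
    $group: {
      _id: { timestamp: { $dateTrunc: trunc } },
      count: { $sum: 1 },
    },
  };
}

const weeklyStage = buildIntervalGroupStage("weekly", "2019-02-01");
console.log(JSON.stringify(weeklyStage));
```

In the model method this stage would take the place of the $project and $group stages, placed after the $match and before the $sort.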