I'm using Node.js and have a MongoDB collection whose documents carry a custom timestamp field stored as a string, for example timestamp: "24-09-2020 08:25:55.301". I'd like to count the documents per day, that is, how many documents were created on each day of the past week (based on the timestamp field).
Ideally the query returns an array of size 7, where each item holds the number of documents found for that date.
How can I do it?
You should never store date/time values as strings; always use proper Date objects.
One solution is this one:
db.collection.aggregate([
{
$set: {
timestamp: {
$dateFromString: {
dateString: "$timestamp", format: "%d-%m-%Y %H:%M:%S.%L", timezone: "Europe/Zurich"
}
}
}
},
{ $match: { timestamp: { $gte: moment.tz('Europe/Zurich').subtract(7, 'days').startOf('day').toDate() } } },
{
$group: {
_id: {
$dateFromParts: {
year: { $year: { date: "$timestamp", timezone: "Europe/Zurich" } },
month: { $month: { date: "$timestamp", timezone: "Europe/Zurich" } },
day: { $dayOfMonth: { date: "$timestamp", timezone: "Europe/Zurich" } },
timezone: "Europe/Zurich"
}
},
documents: { $sum: 1 }
}
},
{ $project: { timestamp: "$_id", documents: 1, _id: 0 } }
])
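The pipeline above only returns days that actually contain documents, so the result can have fewer than 7 items. Below is a minimal sketch of padding it in application code, assuming rows shaped like the output of the final $project ({ timestamp, documents }). The names dayKey and fillDailyCounts are hypothetical helpers, not part of any library, and the window is assumed to be the 7 calendar days ending today in the given time zone.

```javascript
// Hypothetical helper: format a Date as "YYYY-MM-DD" in an IANA time zone.
function dayKey(date, timeZone) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
  }).formatToParts(date);
  const get = (type) => parts.find((p) => p.type === type).value;
  return `${get('year')}-${get('month')}-${get('day')}`;
}

// Hypothetical helper: one entry per calendar day, oldest first, 0 where no row exists.
function fillDailyCounts(rows, now, timeZone, days = 7) {
  const counts = new Map(
    rows.map((r) => [dayKey(r.timestamp, timeZone), r.documents])
  );
  const [y, m, d] = dayKey(now, timeZone).split('-').map(Number);
  const result = [];
  for (let i = days - 1; i >= 0; i--) {
    // Pure calendar arithmetic in UTC, so DST changes cannot skip or repeat a day.
    const key = new Date(Date.UTC(y, m - 1, d - i)).toISOString().slice(0, 10);
    result.push({ day: key, documents: counts.get(key) || 0 });
  }
  return result;
}
```

Calling fillDailyCounts(rows, new Date(), 'Europe/Zurich') on the aggregation result would then always give 7 items.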
I'm trying to match orders within a time interval, group them by date, and sort the groups so that all orders are split into days. However, I'd also like the orders array within each day to be sorted.
Orders.aggregate([{
$match: {
'created_at': {
$gte: timeInterval, //match last 3 days
}
}
}, {
"$group": {
_id: {
$dateToString: { format: "%d/%m/%Y", date: "$created_at", timezone: "Australia/Melbourne" } // group by date
},
orders: { $push: '$$ROOT' }
}
}, {
$sort: { '_id': -1 } // sort by date
}])
const OrderSchema = new Schema({
...,
created_at: {
type: Date,
default: Date.now
},
})
exData = [{ _id: "26/12/2021", orders: [...] }, { _id: "25/12/2021", orders: [...] }]
// need to sort orders array
I've tried:
{
$sort: { '_id': -1, "orders.created_at": -1 }
}
I would appreciate any help. Thanks.
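Your aggregation pipeline should look like this:
$match - Filter documents by condition.
$sort - Sort documents by created_at in descending order.
$group - Group by the formatted created_at date and push each document into the orders array, in the order in which the documents arrive from the $sort stage.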
Orders.aggregate([
{
$match: {
'created_at': {
$gte: timeInterval, //match last 3 days
}
}
},
{
$sort: { "created_at": -1 }
},
{
$group: {
_id: {
$dateToString: { format: "%d/%m/%Y", date: "$created_at", timezone: "Australia/Melbourne" } // group by date
},
orders: { $push: '$$ROOT' }
}
}
])
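Two things are worth keeping in mind with this pipeline: $group does not promise any particular order for the groups it emits, and "%d/%m/%Y" keys do not sort chronologically as plain strings (for example, "26/11/2021" sorts after "25/12/2021"). As a fallback, both levels can be sorted in application code. This is a minimal sketch that assumes the result shape shown in the question; keyToNumber and sortGroupedOrders are hypothetical helper names.

```javascript
// Turn a "dd/mm/yyyy" group key into a sortable number (yyyymmdd).
function keyToNumber(key) {
  const [day, month, year] = key.split('/').map(Number);
  return year * 10000 + month * 100 + day;
}

// Newest day first, and newest order first inside each day.
// Returns new arrays and leaves the input untouched.
function sortGroupedOrders(groups) {
  return groups
    .map((g) => ({
      ...g,
      orders: [...g.orders].sort((a, b) => b.created_at - a.created_at),
    }))
    .sort((a, b) => keyToNumber(b._id) - keyToNumber(a._id));
}
```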
I have the code snippet below, which retrieves the date and document count from a MongoDB collection, starting from a specific date. Example: retrieve date and count from 05-05-2020 onwards.
||Date||Count||
|05-06-2020|4|
|05-07-2020|25| and so on.
I want to add logic that retrieves the aggregate sum over each 7-day period instead of per individual date. I'd appreciate any help.
mongoClient.db().collection(COLLECTION.AUDIT).aggregate([
{
$match: { created_at: { $gt: date } }
},
{
$group: {
_id: {
$dateToString: { format: "%Y-%m-%d", date: "$created_at" }
},
count: { $sum: 1 }
}
},
{
$sort: { "_id": -1 }
}
])
The simplest way to do what I think you're asking is to change the group key from $dateToString to $week. $week returns the week of the year as a number, so this groups all documents from the same calendar week and returns a count per week number. To get both results from one query, combine them in a $facet stage, where each facet is its own array of stages. So:
mongoClient.db().collection(COLLECTION.AUDIT).aggregate([
{
$match: { created_at: { $gt: date } }
},
{
$facet: {
// Each facet is a sub-pipeline, so it must be an array of stages
by_week: [
{
$group: {
_id: { $week: "$created_at" }, // field paths must be quoted strings
count: { $sum: 1 }
}
},
{ $sort: { "_id": -1 } }
],
by_day: [
{
$group: {
_id: {
$dateToString: { format: "%Y-%m-%d", date: "$created_at" }
},
count: { $sum: 1 }
}
},
{ $sort: { "_id": -1 } }
]
}
}
])
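Note that $week groups by calendar week of the year, so the buckets do not start at the date used in $match. If the goal is consecutive 7-day windows counted from that start date, one option is to derive a bucket index from the millisecond difference between created_at and the start date. The sketch below assumes created_at is a Date; sevenDayBucket and buildSevenDayGroupStage are hypothetical names, and the plain function only mirrors what the stage computes on the server.

```javascript
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// Plain JavaScript version of the bucket computation, for illustration.
function sevenDayBucket(createdAt, start) {
  return Math.floor((createdAt - start) / WEEK_MS);
}

// $group stage that counts documents per 7-day window after `start`.
// $subtract on two dates yields the difference in milliseconds.
function buildSevenDayGroupStage(start) {
  return {
    $group: {
      _id: {
        $floor: { $divide: [{ $subtract: ['$created_at', start] }, WEEK_MS] },
      },
      count: { $sum: 1 },
    },
  };
}
```

Bucket 0 then covers the first 7 days after the start date, bucket 1 the next 7, and so on. The stage could replace the $group inside by_week.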
Given a collection looking like this
{ "uuid" : "32645503-0d51-4fc6-8e4a-28b03714db5e", "date" : "1593701241361" }
{ "uuid" : "63022d8b-e16e-4387-b5c0-74d09a95fb72", "date" : "1593787041640" }
where date is a date in the format of a number, what would the query look like to get all documents with the same day? It should be very fast/efficient, because my collection has more than 100 000 documents.
You can use $toDate and $dateToString mongodb operators to achieve what you want.
Here is the query which finds all the duplicates:
collection.aggregate([{
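// This stage converts the epoch-millisecond value to a Date object.
// $toLong comes first because $toDate treats a string as a date string, not as epoch milliseconds
$addFields: {
dateString: { $toDate: { $toLong: "$date" } }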
}
}, {
// This stage converts dateObject to date day
$addFields: {
day: { $dateToString: { format: "%Y-%m-%d", date: "$dateString" } }
}
}, {
// This stage groups all the records with same day
$group: {
_id: { day: "$day" },
uniqueRecords: { $addToSet: "$uuid" },
count: { $sum: 1 }
}
}, {
// This stage finds groups which have more than 1 record. (Which means duplication)
$match: {
count: { $gt: 1 }
}
}])
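If the goal is to fetch the documents for one particular day rather than to group every day, a range filter on the raw value avoids converting each document and can use an index on date. This is a sketch assuming date holds epoch milliseconds as a number and that days are delimited in UTC; dayRangeFilter is a hypothetical helper.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Build a filter matching every document whose `date` falls on the given UTC day.
// `day` is a "YYYY-MM-DD" string.
function dayRangeFilter(day) {
  const start = Date.parse(`${day}T00:00:00Z`);
  if (Number.isNaN(start)) {
    throw new Error(`Invalid day: ${day}`);
  }
  return { date: { $gte: start, $lt: start + DAY_MS } };
}

// Possible usage (not executed here):
// collection.find(dayRangeFilter('2020-07-02'))
```

If the values are stored as strings, as the sample documents suggest, numeric bounds will not match them; converting the field to a number once is probably the simpler fix.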
You can use moment to convert the epoch values into a date format.
let epoch = 1593787041640;
let result = moment(epoch).format('DD/MM/YY');
Then you can iterate over the documents, compare the formatted strings, and return the ones that match.
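The iterate-and-compare step can also be sketched without moment by keying each document on its UTC calendar day. This assumes date is an epoch-millisecond value (a number or a numeric string); groupByDay is a hypothetical helper. Because it runs in the application, every document has to be fetched from the database first.

```javascript
// Group documents by UTC calendar day ("YYYY-MM-DD").
function groupByDay(docs) {
  const groups = new Map();
  for (const doc of docs) {
    const day = new Date(Number(doc.date)).toISOString().slice(0, 10);
    if (!groups.has(day)) {
      groups.set(day, []);
    }
    groups.get(day).push(doc);
  }
  return groups;
}
```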
I have a query that is returning the total number of entries in a collection per year-month, grouped by a location.
This is returning data exactly as I need it if the location has results for the year-month in question.
However, is it possible to insert an entry for a month that does not have a result? For instance, let's say my $match has a date range of 01-2019 to 12-2019. I would like to get an entry for each of the 12 months, with a default of total: 0.
Truncated Schema :
{
branchId: { type: String, required: true },
orgId: { type: String, required: true },
stars: { type: Number, default: 0 },
reviewUpdatedAt: { type: Date, default: Date.now }
}
What I've tried:
[
{
$match: {
stars: { $exists: true, $gte: 1 },
orgId: '100003',
reviewUpdatedAt: { $gte: new Date(fromDate), $lte: new Date(toDate) }
}
},
{
$group: {
_id: {
date: {
$dateToString: {
format: "%m-%Y",
date: "$reviewUpdatedAt"
}
},
loc: "$branchId"
},
total: {
$sum: 1
}
}
},
{
$group: {
_id: "$_id.loc",
reviews: {
$push: {
total: "$total",
"date": "$_id.date"
}
}
}
}
]
Starting in Mongo 5.1, it's a perfect use case for the new $densify aggregation operator:
// { date: "02-2019", value: 12 }
// { date: "03-2019", value: 2 }
// { date: "11-2019", value: 3 }
db.collection.aggregate([
{ $set: {
date: { $dateFromString: { // "02-2019" => ISODate("2019-02-01")
dateString: { $concat: [ "01-", "$date" ] },
format: "%d-%m-%Y"
}}
}},
{ $densify: {
field: "date",
range: {
step: 1,
unit: "month",
bounds: [ISODate("2019-01-01"), ISODate("2020-01-01")]
}
}},
{ $set: {
value: { $cond: [ { $not: ["$value"] }, 0, "$value" ] },
date: { $dateToString: { format: "%m-%Y", date: "$date" } } // ISODate("2019-02-01") => "02-2019"
}}
])
// { date: "01-2019", value: 0 }
// { date: "02-2019", value: 12 }
// { date: "03-2019", value: 2 }
// { date: "04-2019", value: 0 }
// { date: "05-2019", value: 0 }
// { date: "06-2019", value: 0 }
// { date: "07-2019", value: 0 }
// { date: "08-2019", value: 0 }
// { date: "09-2019", value: 0 }
// { date: "10-2019", value: 0 }
// { date: "11-2019", value: 3 }
// { date: "12-2019", value: 0 }
This:
casts date strings into dates (the first $set stage)
densifies documents ($densify) by creating new documents in a sequence of documents where certain values for a field (in our case field: "date") are missing:
the step for our densification is 1 month: range: { step: 1, unit: "month", ... }
and we densify within the range of dates provided with bounds: [ISODate("2019-01-01"), ISODate("2020-01-01")]
sets dates back to date strings: date: { $dateToString: { format: "%m-%Y", date: "$date" } }
and also sets ($set) value to 0 only for new documents included during the densify stage ({ value: { $cond: [ { $not: ["$value"] }, 0, "$value" ] } })
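For the schema in the question, the same idea could probably be applied per branch by using the partitionByFields option of $densify. The following is a sketch under stated assumptions, not run against a live server: buildMonthlyReviewPipeline is a hypothetical name, from and to are assumed to be first-of-month UTC dates (so generated values line up with the truncated months), and $dateTrunc requires MongoDB 5.0 or later.

```javascript
// Build a pipeline that counts reviews per branch and month,
// then fills missing months per branch with total: 0.
function buildMonthlyReviewPipeline(orgId, from, to) {
  return [
    { $match: { orgId, stars: { $gte: 1 }, reviewUpdatedAt: { $gte: from, $lt: to } } },
    {
      $group: {
        _id: {
          loc: '$branchId',
          month: { $dateTrunc: { date: '$reviewUpdatedAt', unit: 'month' } },
        },
        total: { $sum: 1 },
      },
    },
    // Flatten the key so $densify can work on top-level fields.
    { $project: { _id: 0, loc: '$_id.loc', month: '$_id.month', total: 1 } },
    {
      $densify: {
        field: 'month',
        partitionByFields: ['loc'],
        range: { step: 1, unit: 'month', bounds: [from, to] },
      },
    },
    // Documents created by $densify have no total yet.
    { $set: { total: { $ifNull: ['$total', 0] } } },
  ];
}
```

A branch with no reviews at all in the range forms no partition, so it would still be absent from the output.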
At first I thought this would be easier to do in application code, but it can also be done in MongoDB with some input from code:
Say your fromDate is June 2018 and your toDate is June 2019. In your programming language you can easily generate all months between those two dates in mm-yyyy format. You could try to do this inside MongoDB as well, but I would rather pass the list in as input to the query.
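Generating that list of months might look like the sketch below. monthsBetween is a hypothetical helper; it assumes both arguments are Date objects, works in UTC, and includes both end months.

```javascript
// Return every month from `from` to `to` (inclusive) as "mm-yyyy" strings.
function monthsBetween(from, to) {
  const months = [];
  let year = from.getUTCFullYear();
  let month = from.getUTCMonth(); // 0-based
  const endYear = to.getUTCFullYear();
  const endMonth = to.getUTCMonth();
  while (year < endYear || (year === endYear && month <= endMonth)) {
    months.push(`${String(month + 1).padStart(2, '0')}-${year}`);
    month += 1;
    if (month === 12) {
      month = 0;
      year += 1;
    }
  }
  return months;
}
```

The returned array can be passed as the first argument of $setDifference in place of a hard-coded list.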
Query :
db.collection.aggregate([
{
$group: {
_id: {
date: {
$dateToString: {
format: "%m-%Y",
date: "$reviewUpdatedAt"
}
},
loc: "$branchId"
},
Total: {
$sum: 1
}
}
},
{
$group: {
_id: "$_id.loc",
reviews: {
$push: {
Total: "$Total",
"date": "$_id.date"
}
}
}
},
/** Overwrite the existing reviews field with a new array, built as follows:
* since you pass in all months between the two dates, take the difference of the two arrays (input dates - dates present after the group),
* which leaves an array of missing dates; then iterate over that array and
* concatenate the actual reviews array with an entry for each missing date
* */
{
$addFields: {
reviews: {
$reduce: {
input: {
$setDifference: [
[
"06-2018",
"07-2018",
"08-2018",
"09-2018",
"10-2018",
"11-2018",
"12-2018",
"01-2019",
"02-2019",
"03-2019",
"04-2019",
"05-2019",
"06-2019"
],
"$reviews.date"
]
},
initialValue: "$reviews",
in: {
$concatArrays: [
"$$value",
[
{
date: "$$this",
Total: 0
}
]
]
}
}
}
}
}
])
Test : MongoDB-Playground
Ref : javascript-get-all-months-between-two-dates
Step back and realize that you are asking to display data that doesn't exist in the database. Say there is no data for 3/19. This is not a Mongo issue; it applies to any database. The usual approach is to create a 'time table', in your case perhaps one entry per month/year (in Mongo, documents in a collection). It provides a row for every month as the starting point of the query, and the join to your data ($lookup in Mongo) will then yield null for 3/19.
Adding a time table is standard practice in analytic apps. Some come with that feature built into their time-based analytics, so the database doesn't need to do anything. To do it through general queries or reporting in Mongo and SQL databases, you have to add that time collection/table manually.
I am trying to do a per-day aggregation in MongoDB. I already have an aggregation that successfully groups the data by day. However, I want days with no data to show up in the result as well, as empty bins.
Below is what I have so far. I have not been able to find anything in the MongoDB documentation or otherwise that suggests how to do aggregations and produce empty bins:
app.models.profile_view.aggregate(
{ $match: { user: req.user._id , 'viewing._type': 'user' } },
{ $project: {
day: {'$dayOfMonth': '$start'},month: {'$month':'$start'},year: {'$year':'$start'},
duration: '$duration'
} },
{ $group: {
_id: { day:'$day', month:'$month', year:'$year' },
count: { $sum: 1 },
avg_duration: { $avg: '$duration' }
} },
{ $project: { _id: 0, date: '$_id', count: 1, avg_duration: 1 }}
).exec().then(function(time_series) {
console.log(time_series)
return res.send(200, [{ key: 'user', values: time_series }])
}, function(err) {
console.log(err.stack)
return res.send(500, { error: err, code: 200, message: 'Failed to retrieve profile view data' })
})
I don't think you will be able to solve this problem using aggregation. When you use $group, mongo can only group based on the data you are providing it. In this case, how would mongo know which date values are missing or even what the range of acceptable dates is?
I think your best option would be to add the missing date values to the result of your aggregation.
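Adding the missing dates afterwards could look like the sketch below. It assumes the result shape from the question's final $project ({ date: { day, month, year }, count, avg_duration }) and that days are counted in UTC, which is what $dayOfMonth, $month and $year use when no timezone is given. fillMissingDays is a hypothetical helper, and from/to are the range of days to cover.

```javascript
// Return one entry per UTC day from `from` to `to` (inclusive),
// inserting empty bins where the aggregation returned nothing.
function fillMissingDays(series, from, to) {
  const key = (y, m, d) => `${y}-${m}-${d}`;
  const byDay = new Map(
    series.map((s) => [key(s.date.year, s.date.month, s.date.day), s])
  );
  const result = [];
  const cursor = new Date(
    Date.UTC(from.getUTCFullYear(), from.getUTCMonth(), from.getUTCDate())
  );
  while (cursor <= to) {
    const y = cursor.getUTCFullYear();
    const m = cursor.getUTCMonth() + 1; // $month is 1-based
    const d = cursor.getUTCDate();
    result.push(
      byDay.get(key(y, m, d)) ||
        { date: { day: d, month: m, year: y }, count: 0, avg_duration: null }
    );
    cursor.setUTCDate(d + 1); // rolls over month and year boundaries
  }
  return result;
}
```

In the question's code this could be applied to time_series before sending the response.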
Starting in Mongo 5.1, it's a perfect use case for the new $densify aggregation operator:
// { date: ISODate("2021-12-05") }
// { date: ISODate("2021-12-05") }
// { date: ISODate("2021-12-03") }
// { date: ISODate("2021-12-07") }
db.collection.aggregate([
{ $group: {
_id: { $dateTrunc: { date: "$date", unit: "day" } },
total: { $count: {} }
}},
// { _id: ISODate("2021-12-03"), total: 1 }
// { _id: ISODate("2021-12-05"), total: 2 }
// { _id: ISODate("2021-12-07"), total: 1 }
{ $densify: { field: "_id", range: { step: 1, unit: "day", bounds: "full" } } },
// { _id: ISODate("2021-12-03"), total: 1 }
// { _id: ISODate("2021-12-04") }
// { _id: ISODate("2021-12-05"), total: 2 }
// { _id: ISODate("2021-12-06") }
// { _id: ISODate("2021-12-07"), total: 1 }
{ $project: {
day: "$_id",
_id: 0,
total: { $cond: [ { $not: ["$total"] }, 0, "$total" ] }
}}
])
// { day: ISODate("2021-12-03"), total: 1 }
// { day: ISODate("2021-12-04"), total: 0 }
// { day: ISODate("2021-12-05"), total: 2 }
// { day: ISODate("2021-12-06"), total: 0 }
// { day: ISODate("2021-12-07"), total: 1 }
This:
$groups documents by day with their $count
$dateTrunc truncates your dates at the beginning of their day (the truncation unit).
$densifies documents ($densify) by creating new documents in a sequence of documents where certain values for a field (in our case field: "_id") are missing:
the step for our densification is 1 day: range: { step: 1, unit: "day" }
finally transforms ($project) fields:
renames _id to day
sets the total field to 0 for new documents included during the densify stage ({ total: { $cond: [ { $not: ["$total"] }, 0, "$total" ] } })