I have the following collection, in which each document represents a swipe record created when a member goes to the gym.
{
"_id" : ObjectId(""),
"content" : {
"Date_Key" : "",
"TRANSACTION_EVENT_KEY" : "",
"SITE_NAME" : "",
"Swipe_DateTime" : "",
"Gender" : "",
"Post_Out_Code" : "",
"Year_Of_Birth" : "",
"Time_Key" : "",
"MemberID_Hash" : "",
"Member_Key_Hash" : "",
"Swipes" : ""
},
"collection" : "observations"
}
I want to return the number of members for every number of gym swipes in a given month.
For example:
[
{"nrOfGymSwipes": 0, "nrOfMembers": 10}, // 10 members who swiped 0 times
{"nrOfGymSwipes": 1, "nrOfMembers": 15}, // 15 members who swiped once
{"nrOfGymSwipes": 2, "nrOfMembers": 17},
...
]
I have tried the following:
collection
.aggregate(
[{$match: {"content.Swipe_DateTime": {$regex:"201602"}}},
{$group: {_id: "$content.MemberID_Hash", "nrOfGymSwipes":{$sum: 1}}},
{$sort: {"nrOfGymSwipes": 1}}])
which returns the number of swipes in the given month for each member.
.........
{ _id: '111', nrOfGymSwipes: 16 },
{ _id: '112', nrOfGymSwipes: 16 },
{ _id: '113', nrOfGymSwipes: 17 },
...............
Now I was thinking of grouping by the number of gym swipes and counting the ids. I have tried this, but it doesn't return what I expected:
collection
.aggregate(
[{$match: {"content.Swipe_DateTime": {$regex:"201602"}}},
{$group: {_id: "$content.MemberID_Hash", "nrOfGymSwipes":{$sum: 1}}},
{$group: {_id: "nrOfGymSwipes", "nrOfMembers":{$sum: 1}}}, // <-- added this
{$sort: {"nrOfGymSwipes": 1}}])
Any idea how I can solve this?
Also, is there a way to change the shape of the JSON output? For example, instead of showing the _id: "32131" part, output nrOfMembers: "312321".
You were almost there with your final group: you only needed to prefix the _id value with $ so that it refers to the nrOfGymSwipes field instead of being a literal string. The $sort stage is the other problem, because the field you are trying to sort on no longer exists at that point. Each stage in an aggregation pipeline passes its results to the next stage as modified documents (with a structure determined by that stage), and the last $group stage only produces two fields, "_id" and "nrOfMembers".
You can add a $project stage to make the $sort stage work: it creates the "nrOfGymSwipes" field from the previous _id key, which also gives you the final output in the desired structure. So your final aggregate operation should be:
collection.aggregate([
{ "$match": { "content.Swipe_DateTime": { "$regex":"201602" } } },
{ "$group": { "_id": "$content.MemberID_Hash", "nrOfGymSwipes": { "$sum": 1 } } },
{ "$group": { "_id": "$nrOfGymSwipes", "nrOfMembers": { "$sum": 1 } } },
{ "$project": { "_id": 0, "nrOfGymSwipes": "$_id", "nrOfMembers": 1 } },
{ "$sort": { "nrOfGymSwipes": 1 } }
], function (err, result) { ... });
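To see what the two $group stages compute, here is a minimal in-memory sketch in plain JavaScript, with no MongoDB involved. The helper name countMembersBySwipes, the sample records and the Swipe_DateTime string format are assumptions made for illustration; only the field names come from the question. Like the pipeline, it can only report members who have at least one swipe in the month, so a row for 0 swipes never appears.

```javascript
// Hypothetical helper: emulates $match -> $group -> $group -> $project -> $sort in memory.
function countMembersBySwipes(records, monthPrefix) {
  // First $group: number of swipes per member within the month.
  const swipesPerMember = new Map();
  for (const rec of records) {
    // Stand-in for the $regex match: keep records whose timestamp contains the month.
    if (!rec.content.Swipe_DateTime.includes(monthPrefix)) continue;
    const id = rec.content.MemberID_Hash;
    swipesPerMember.set(id, (swipesPerMember.get(id) || 0) + 1);
  }
  // Second $group: number of members per swipe count.
  const membersPerCount = new Map();
  for (const n of swipesPerMember.values()) {
    membersPerCount.set(n, (membersPerCount.get(n) || 0) + 1);
  }
  // $project + $sort: rename the key and order by swipe count.
  return [...membersPerCount.entries()]
    .map(([nrOfGymSwipes, nrOfMembers]) => ({ nrOfGymSwipes, nrOfMembers }))
    .sort((a, b) => a.nrOfGymSwipes - b.nrOfGymSwipes);
}

// Made-up sample data; the timestamp format is an assumption.
const sample = [
  { content: { MemberID_Hash: "111", Swipe_DateTime: "20160201 08:00" } },
  { content: { MemberID_Hash: "111", Swipe_DateTime: "20160203 09:00" } },
  { content: { MemberID_Hash: "112", Swipe_DateTime: "20160210 18:30" } },
  { content: { MemberID_Hash: "113", Swipe_DateTime: "20160215 07:15" } },
  { content: { MemberID_Hash: "113", Swipe_DateTime: "20160105 07:15" } }, // January, filtered out
];
const swipeHistogram = countMembersBySwipes(sample, "201602");
console.log(swipeHistogram);
// [ { nrOfGymSwipes: 1, nrOfMembers: 2 }, { nrOfGymSwipes: 2, nrOfMembers: 1 } ]
```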
I have the following aggregation.
It scans all the messages, groups them by docId, and returns only the most recently updated message in each group.
db.getCollection('Messages').aggregate([ { '$match': { docType: 'order' }}, { '$sort': { updatedAt: -1 } }, { '$group': { _id: '$docId', content: { '$first': '$content' }}}])
which returns -
[
{
"_id" : "some id1",
"content" : "some msg1"
}
/* 11 */
{
"_id" : "some id2",
"content" : "some msg2"
}
...
]
It works as intended (though I am not sure how well optimized it is).
But now I need to add one more thing on top of that.
In the UI I have a list of documents, and I need to show only the latest message for each. But I also have paging, so I don't need to fetch the last message for XXXXXX documents, only for the documents on one page.
So basically something like this:
.find({'docId':{$in:['doc1', 'doc2', 'doc3'...]}}) - if the page had 3 items
But I am not sure how to combine all of that.
Message sample:
{
"_id": "11111",
"docType": "order",
"docId": "12345", // not unique: there can be many messages for one docId
"content": "my message",
"updatedAt": "01/01/2020..."
}
Adding
{ '$match': { _id: { '$in': ["docId1", "docId2"]} }}
at the end did the trick!
Edit:
Actually, I think it might be better to add it as the first stage of the pipeline, so:
db.getCollection('Messages').aggregate([ { '$match': { docId: { '$in': ["5d79cba1-925b-416b-9408-6f4429d7c107", "8e31c748-c86d-407e-8d83-9810c8e23e3e"]} }}, { '$match': { docType: 'order' }}, { '$sort': { cAt: -1 } }, { '$group': { _id: '$docId', content: { '$first': '$content' }}}])
Since I am adding those dynamically, I ended up with two $match stages. I am not sure what difference it makes, optimization-wise, to use a single $match with $and versus two separate $match stages.
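On the two-$match question: the MongoDB documentation on aggregation pipeline optimization describes consecutive $match stages being coalesced into a single $match that combines the conditions with $and, so the two forms should generally behave the same. If you would rather emit one stage anyway, the filters can be merged while the pipeline is built. The sketch below is only an illustration: the helper name buildLatestMessagePipeline is hypothetical, and it assumes the sort field is updatedAt as in the message sample.

```javascript
// Hypothetical helper: merges the dynamically added filters into one $match stage.
function buildLatestMessagePipeline(docIds, docType) {
  const conditions = [];
  if (docIds && docIds.length > 0) conditions.push({ docId: { $in: docIds } });
  if (docType) conditions.push({ docType: docType });

  const pipeline = [];
  // Only add a $match stage when there is something to filter on.
  if (conditions.length > 0) pipeline.push({ $match: { $and: conditions } });
  pipeline.push(
    { $sort: { updatedAt: -1 } }, // newest first (field name assumed from the sample)
    { $group: { _id: "$docId", content: { $first: "$content" } } }
  );
  return pipeline;
}

const pagePipeline = buildLatestMessagePipeline(["doc1", "doc2", "doc3"], "order");
console.log(JSON.stringify(pagePipeline, null, 2));
```

The resulting array can be passed to db.getCollection('Messages').aggregate(...) in place of the hand-written pipeline.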
My Mongo query fails to return any output when I increase the number of group stages in it. Below is the snippet of the query I am using:
.group({
_id: "$date",
count: {
$sum: 1
},
})
/*
.group({
_id: "$joinDate",
count: {
$sum: 1
},
})
.group({
_id: "$applyDate",
count: {
$sum: 1
},
})*/
You can do it the following way:
yourModel.aggregate([ { $group : { _id : "$date" } } ] )
This might help you solve your problem.
@CsAlkemy, it is because the fields you are trying to use are not propagated after the $group stage.
For example, consider the following sample document and an aggregate query on the document,
{
"name": "SV",
"age": 21,
"school": "KV",
"city": "Ajmer"
}
Aggregate Query
db.temp.aggregate([
{
$group: {
"_id": "$school",
"count": {
$sum: 1
}
}
}])
Output
{ "_id" : "KV", "count" : 1 }
As you can see, the only fields we get back are _id and count. The original fields name, age, school and city are gone (the school value survives only as _id) and can't be referenced in later stages.
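The same effect can be reproduced without a database. The sketch below is a plain JavaScript stand-in for a counting $group stage, not MongoDB itself; the helper name groupCount and the two extra sample people are made up for illustration. Chaining a second group on a field that the first group did not output puts every document under a null key, which mirrors how $group treats a missing field.

```javascript
// Hypothetical in-memory stand-in for { $group: { _id: "$<field>", count: { $sum: 1 } } }.
function groupCount(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    // A missing field is grouped under null, as $group does.
    const key = doc[field] === undefined ? null : doc[field];
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts.entries()].map(([_id, count]) => ({ _id, count }));
}

const people = [
  { name: "SV", age: 21, school: "KV", city: "Ajmer" },
  { name: "AB", age: 22, school: "KV", city: "Jaipur" }, // made-up extra row
  { name: "CD", age: 23, school: "DPS", city: "Ajmer" }, // made-up extra row
];

// First group: only _id and count come out.
const bySchool = groupCount(people, "school");
// Second group on "city": that field no longer exists in the first group's output.
const chained = groupCount(bySchool, "city");

console.log(bySchool); // [ { _id: 'KV', count: 2 }, { _id: 'DPS', count: 1 } ]
console.log(chained);  // [ { _id: null, count: 2 } ]
```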
I want to build an online test application using MongoDB and Node.js. One feature lets an admin view a user's test history (with a date filter option).
How do I write the query if I want to display only users whose testResults array contains the date specified by the admin?
The date filter will be based on the day, month and year of scheduledAt.startTime, and I think I must use the aggregation framework to achieve this.
Let's say I have a users document like the one below:
{
"_id" : ObjectId("582a7b315c57b9164cac3295"),
"username" : "lalalala@gmail.com",
"displayName" : "lalala",
"testResults" : [
{
"applyAs" : [
"finance"
],
"scheduledAt" : {
"endTime" : ISODate("2016-11-15T16:00:00.000Z"),
"startTime" : ISODate("2016-11-15T01:00:00.000Z")
},
"results" : [
ObjectId("582a7b3e5c57b9164cac3299"),
ObjectId("582a7cc25c57b9164cac329d")
],
"_id" : ObjectId("582a7b3e5c57b9164cac3296")
},
{
.....
}
],
"password" : "andi",
}
testResults document:
{
"_id" : ObjectId("582a7cc25c57b9164cac329d"),
"testCategory" : "english",
"testVersion" : "EAX",
"testTakenTime" : ISODate("2016-11-15T03:10:58.623Z"),
"score" : 2,
"userAnswer" : [
{
"answer" : 1,
"problemId" : ObjectId("581ff74002bb1218f87f3fab")
},
{
"answer" : 0,
"problemId" : ObjectId("581ff78202bb1218f87f3fac")
},
{
"answer" : 0,
"problemId" : ObjectId("581ff7ca02bb1218f87f3fad")
}
],
"__v" : 0
}
What I have tried so far is shown below. If I want to count the total number of documents, which part of my aggregation should I change? In the query below, totalData is summed per group, not across all returned documents.
User
.aggregate([
{
$unwind: '$testResults'
},
{
$project: {
'_id': 1,
'displayName': 1,
'testResults': 1,
'dayOfTest': { $dayOfMonth: '$testResults.scheduledAt.startTime' },
'monthOfTest': { $month: '$testResults.scheduledAt.startTime' },
'yearOfTest': { $year: '$testResults.scheduledAt.startTime' }
}
},
{
$match: {
dayOfTest: date.getDate(),
monthOfTest: date.getMonth() + 1,
yearOfTest: date.getFullYear()
}
},
{
$group: {
_id: {id: '$_id', displayName: '$displayName'},
testResults: {
$push: '$testResults'
},
totalData: {
$sum: 1
}
}
},
])
.then(function(result) {
res.send(result);
})
.catch(function(err) {
console.error(err);
next(err);
});
You can try something like this. The $project stage keeps a user's test results if any element of testResults matches the date passed in. Add it as the first step in the pipeline, and then add the grouping stage the way you want.
$map applies an equality comparison between the date passed in and the start date of each test result element, and generates an array of true and false values. $anyElementTrue inspects this array and returns true only if there is at least one true value in it. The $match stage then keeps only the documents whose matched value is true.
aggregate([{
"$project": {
"_id": 1,
"displayName":1,
"testResults": 1,
"matched": {
"$anyElementTrue": {
"$map": {
"input": "$testResults",
"as": "result",
"in": {
"$eq": [{ $dateToString: { format: "%Y-%m-%d", date: '$$result.scheduledAt.startTime' } }, '2016-11-15']
}
}
}
}
}
}, {
"$match": {
"matched": true
}
}])
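For readers who find the operator combination hard to picture, here is a rough plain JavaScript equivalent of the $map plus $anyElementTrue logic described above. It is an in-memory illustration only: the helper names toDay and usersWithTestOn and the second sample user are made up, and dates are formatted in UTC, which is also the default for $dateToString.

```javascript
// Hypothetical helper: formats a Date as YYYY-MM-DD in UTC, like $dateToString with "%Y-%m-%d".
function toDay(date) {
  return date.toISOString().slice(0, 10);
}

// Hypothetical helper: keeps users that have at least one test scheduled on the given day.
function usersWithTestOn(users, day) {
  return users.filter((user) => {
    // $map: one boolean per testResults element.
    const flags = user.testResults.map((r) => toDay(r.scheduledAt.startTime) === day);
    // $anyElementTrue: true if at least one flag is true.
    return flags.some(Boolean);
  });
}

const sampleUsers = [
  { displayName: "lalala", testResults: [
    { scheduledAt: { startTime: new Date("2016-11-15T01:00:00.000Z") } },
  ] },
  { displayName: "other", testResults: [ // made-up second user
    { scheduledAt: { startTime: new Date("2016-11-16T01:00:00.000Z") } },
  ] },
];

const matchedUsers = usersWithTestOn(sampleUsers, "2016-11-15");
console.log(matchedUsers.map((u) => u.displayName)); // [ 'lalala' ]
```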
Alternative Version:
Similar to the version above, but this one combines the project and match stages into one. The $cond inside $redact performs the match: when a matching element is found, the complete document is kept ($$KEEP); otherwise it is discarded ($$PRUNE).
aggregate([{
"$redact": {
"$cond": [{
"$anyElementTrue": {
"$map": {
"input": "$testResults",
"as": "result",
"in": {
"$eq": [{
$dateToString: {
format: "%Y-%m-%d",
date: '$$result.scheduledAt.startTime'
}
}, '2016-11-15']
}
}
}
},
"$$KEEP",
"$$PRUNE"
]
}
}])
I have a collection db.activities, each item of which has a dueDate. I need to present the data in the following format, which is basically a list of activities due today and a list of activities due this week:
{
"today": [
{ _id: 1, name: "activity #1" ... },
{ _id: 2, name: "activity #2" ... }
],
"thisWeek": [
{ _id: 3, name: "activity #3" ... }
]
}
I managed to accomplish this by simply querying for the week's activities as a flat list and then grouping them with JavaScript on the client, but I suspect this is a very dirty solution and would like to do it on the server.
Look up the MongoDB aggregation pipeline.
Your aggregation needs a match by date, a group by date, and maybe a sort stage, also by date.
Without the data schema, it will be along the lines of:
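db.collection.aggregate([
  { $match: { "duedate": { "$gte": start_dt, "$lte": end_dt } } },
  // every $group field other than _id needs an accumulator, so $push the activity details
  { $group: { _id: "$duedate", activities: { $push: { recordid: "$_id", name: "$name" } } } },
  { "$sort": { "_id": 1 } }
]);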
If you want all records, remove the $match or use { $match: {} }, as one does with find.
In my opinion, you cannot aggregate both by day and by week within one command. The weekly one may be achieved by projecting duedate through MongoDB's $dayOfWeek, along the lines of:
db.collection.aggregate([
  { $match: { "duedate": { "$gte": start_dt, "$lte": end_dt } } },
  // keep name alongside the computed day; _id is included by default
  { $project: { name: 1, dayOfWeek: { $dayOfWeek: "$duedate" } } },
  { $group: { _id: "$dayOfWeek", activities: { $push: { recordid: "$_id", name: "$name" } } } },
  { "$sort": { "_id": 1 } }
]);
check out http://docs.mongodb.org/manual/reference/operator/aggregation/dayOfWeek/
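As a different technique from the one above: on MongoDB 3.4 or later, the $facet stage runs several sub-pipelines over the same input and returns one document with one array per sub-pipeline, which is close to the today / thisWeek shape asked for. The sketch below only builds such a pipeline. The helper name buildDuePipeline is hypothetical, the field name dueDate is taken from the question, and "this week" is assumed to mean the six days after today, with day boundaries in UTC; adjust these to your own definition.

```javascript
// Hypothetical helper: builds a $facet pipeline with one bucket per time window.
function buildDuePipeline(now) {
  const DAY_MS = 24 * 60 * 60 * 1000;
  // Day boundaries are computed in UTC here (an assumption).
  const startOfToday = new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate())
  );
  const startOfTomorrow = new Date(startOfToday.getTime() + DAY_MS);
  const endOfWindow = new Date(startOfToday.getTime() + 7 * DAY_MS);

  return [
    // Narrow to the whole window first so an index on dueDate can help.
    { $match: { dueDate: { $gte: startOfToday, $lt: endOfWindow } } },
    {
      $facet: {
        today: [{ $match: { dueDate: { $lt: startOfTomorrow } } }],
        thisWeek: [{ $match: { dueDate: { $gte: startOfTomorrow } } }],
      },
    },
  ];
}

const duePipeline = buildDuePipeline(new Date("2016-11-15T10:30:00.000Z"));
console.log(JSON.stringify(duePipeline, null, 2));
```

Passing the result to db.activities.aggregate(...) should yield a single document of the form { today: [...], thisWeek: [...] }.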
This is the collection structure:
[{
"_id" : "....",
"name" : "aaaa",
"level_max_leaves" : [
{
level : "ObjectIdString 1",
max_leaves : 4,
}
]
},
{
"_id" : "....",
"name" : "bbbb",
"level_max_leaves" : [
{
level : "ObjectIdString 2",
max_leaves : 2,
}
]
}]
I need to find the subdocuments whose level_max_leaves.level value matches a given input value.
This is what I tried, for example:
var empLevelId = 'ObjectIdString 1' ;
MyModel.aggregate(
{$unwind: "$level_max_leaves"},
{$match: {"$level_max_leaves.level": empLevelId } },
{$group: { "_id": "$level_max_leaves.level",
"total": { "$sum": "$level_max_leaves.max_leaves" }}},
function (err, res) {
console.log(res);
});
But here the $match filter is not working: I can't get the results for ObjectIdString 1.
If I filter on the name field, it works fine, like this:
{$match: {"$name": "aaaa" } },
But at the subdocument level it returns 0 results:
{$match: {"$level_max_leaves.level": "ObjectIdString 1"} },
My expected result was,
{
"_id" : "ObjectIdString 1",
"total" : 4,
}
You have typed the $match incorrectly. Fields with $ prefixes are either for the implemented operators or for "variable" references to field content. So you just type the field name:
MyModel.aggregate(
[
{ "$match": { "level_max_leaves.level": "ObjectIdString 1" } },
{ "$unwind": "$level_max_leaves" },
{ "$match": { "level_max_leaves.level": "ObjectIdString 1" } },
{ "$group": {
"_id": "$level_max_leaves.level",
"total": { "$sum": "$level_max_leaves.max_leaves" }
}}
],
function (err, res) {
console.log(res);
}
);
Which on the sample you provide produces:
{ "_id" : "ObjectIdString 1", "total" : 4 }
It is also good practice to $match first in your pipeline. An index can only be used when the $match (or a $sort) comes at the start of the pipeline. Beyond that, without the initial $match stage, your aggregation pipeline would perform an $unwind on every document in the collection, whether it met the conditions or not.
So generally what you want to do here is:
Match the documents that contain the required elements in the array
Unwind the array of the matching documents
Match the required array content excluding all others
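The three steps above, followed by the $group, can be mirrored on plain arrays. This is an in-memory illustration in JavaScript, not a replacement for the pipeline; the helper name totalLeavesForLevel is hypothetical and the sample data is taken from the question.

```javascript
// Hypothetical in-memory version of: $match -> $unwind -> $match -> $group.
function totalLeavesForLevel(docs, levelId) {
  const total = docs
    // 1. Match the documents that contain the required element in the array.
    .filter((doc) => doc.level_max_leaves.some((e) => e.level === levelId))
    // 2. Unwind the array of the matching documents.
    .flatMap((doc) => doc.level_max_leaves)
    // 3. Match the required array content, excluding all others.
    .filter((e) => e.level === levelId)
    // $group with $sum: add up max_leaves.
    .reduce((sum, e) => sum + e.max_leaves, 0);
  // Unlike $group, this returns a result with total 0 even when nothing matches.
  return { _id: levelId, total: total };
}

const levels = [
  { name: "aaaa", level_max_leaves: [{ level: "ObjectIdString 1", max_leaves: 4 }] },
  { name: "bbbb", level_max_leaves: [{ level: "ObjectIdString 2", max_leaves: 2 }] },
];

const levelTotal = totalLeavesForLevel(levels, "ObjectIdString 1");
console.log(levelTotal); // { _id: 'ObjectIdString 1', total: 4 }
```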