I have the following data structure in MongoDB:
[
{
"id" : "unique id 1",
"timeStamp" : "timeStamp",
"topicInfo" : [
{
topic : "topic1",
offset : "offset number",
time: 1464875267637
},
{
topic : "topic2",
offset : "offset number",
time: 1464875269709
},
{
topic : "topic3",
offset : "offset number",
time : 1464875270849
}
]
},
{
"id" : "unique id 2",
"timeStamp" : "timeStamp",
"topicInfo" : [
{
topic : "15",
offset : "offset number",
time : 1464875271884
},
{
topic : "topic2",
offset : "offset number",
time : 1464875273887
},
{
topic : "topic3",
offset : "offset number",
time : 1464875272848
}
]
}
]
Now I want to find all entries that have a topic called "topic2" whose time value is the maximum compared to the other objects in the "topicInfo" array. I also want to sort them by "timeStamp". For the example data, the query should return the second object. I have not been able to write the query; any help would be much appreciated.
The best way to do this is in MongoDB 3.2 or newer. We need to $project our documents and use the $filter operator to return the subset of the "topicInfo" array that matches our condition. As of MongoDB 3.2, we can use $max in the condition expression of the $project stage and perform a logical operation on the returned value.
The final stage in the pipeline is the $match stage, where you filter out documents with an empty "topicInfo" array by using the $exists element query operator with dot notation to test for the first element of the array. This also reduces both the amount of data sent over the wire and the time and memory used to decode documents on the client side.
db.collection.aggregate([
{ "$project": {
"topicInfo": {
"$filter": {
"input": "$topicInfo",
"as": "t",
"cond": {
"$and": [
{ "$eq": [ "$$t.topic", "topic2"] },
{ "$eq": [ "$$t.time", { "$max": "$topicInfo.time" } ] }
]
}
}
}
}},
{ "$match": { "topicInfo.0": { "$exists": true } } }
])
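The question also asks for the results to be sorted by "timeStamp", which the pipeline above does not project. The following is a minimal sketch of that extension, assuming the field names are exactly as shown in the question. The names `sortedPipeline`, `modelPipeline` and `sampleDocs` are illustrative only, and `modelPipeline` is a plain JavaScript model of what the stages compute (handy for checking expectations without a database), not MongoDB code.

```javascript
// Pipeline to pass to db.collection.aggregate(sortedPipeline) (MongoDB 3.2+).
const sortedPipeline = [
  { $project: {
      id: 1,
      timeStamp: 1, // keep the sort key, which the original $project drops
      topicInfo: {
        $filter: {
          input: "$topicInfo",
          as: "t",
          cond: { $and: [
            { $eq: ["$$t.topic", "topic2"] },
            { $eq: ["$$t.time", { $max: "$topicInfo.time" }] }
          ] }
        }
      }
  } },
  { $match: { "topicInfo.0": { $exists: true } } },
  { $sort: { timeStamp: 1 } }
];

// Plain JavaScript model of the same logic (assumption: mirrors the stages above).
function modelPipeline(docs, topic) {
  return docs
    .map(function (doc) {
      const maxTime = Math.max(...doc.topicInfo.map(function (t) { return t.time; }));
      return {
        id: doc.id,
        timeStamp: doc.timeStamp,
        // keep only the entry for `topic`, and only if it holds the maximum time
        topicInfo: doc.topicInfo.filter(function (t) {
          return t.topic === topic && t.time === maxTime;
        })
      };
    })
    .filter(function (doc) { return doc.topicInfo.length > 0; })
    .sort(function (a, b) {
      return a.timeStamp < b.timeStamp ? -1 : a.timeStamp > b.timeStamp ? 1 : 0;
    });
}

// The sample documents from the question.
const sampleDocs = [
  { id: "unique id 1", timeStamp: "timeStamp", topicInfo: [
    { topic: "topic1", offset: "offset number", time: 1464875267637 },
    { topic: "topic2", offset: "offset number", time: 1464875269709 },
    { topic: "topic3", offset: "offset number", time: 1464875270849 }
  ] },
  { id: "unique id 2", timeStamp: "timeStamp", topicInfo: [
    { topic: "15", offset: "offset number", time: 1464875271884 },
    { topic: "topic2", offset: "offset number", time: 1464875273887 },
    { topic: "topic3", offset: "offset number", time: 1464875272848 }
  ] }
];

console.log(modelPipeline(sampleDocs, "topic2").map(function (d) { return d.id; }));
// prints [ 'unique id 2' ]
```

Only the second document survives, because it is the only one in which the "topic2" entry carries the largest time in its array.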
You can do it with the aggregation framework like this:
db.test.aggregate([
{ $unwind: '$topicInfo' },
{ $match: { 'topicInfo.topic': 'topic2' } },
{ $group: {
_id: '$id',
timeStamp: { $first: '$timeStamp' }, // the field is spelled "timeStamp" in the documents
time: { $max: '$topicInfo.time' }
} },
{ $sort: { timeStamp: 1 } }
]).pretty()
I want to perform an aggregation query that does basic pagination:
Find all orders that belong to a certain company_id
Sort the orders by order_number
Count the total number of documents
Skip to e.g. document number 100 and pass on the rest
Limit the number of documents to e.g. 2 and pass them on
Finish by returning the count and a selected few fields from the documents
Here is a breakdown of the query:
db.Order.collection.aggregate([
This finds all matching documents:
{ '$match' : { "company_id" : ObjectId("54c0...") } },
This sorts the documents:
{ '$sort' : { 'order_number' : -1 } },
This counts the documents and passes on the unmodified documents, but I'm sure I'm doing it wrong, because things turn weird from here:
{
'$group' : {
'_id' : null,
'count' : { '$sum' : 1 },
'entries' : { '$push' : "$$ROOT" }
}
},
This seems to skip some documents:
{ "$skip" : 100 },
This is supposed to limit the documents, but it does not:
{ "$limit" : 2 },
This does return the count, but it does not return the documents in an array, instead it returns arrays with each field:
{ '$project' : {
'count' : 1,
'entries' : {'_id' : "$entries._id", 'order_number' : "$entries.order_number"}
}
}
])
This is the result:
[
{ "_id" : null,
"count" : 300,
"entries" : [
{
"_id" : [ObjectId('5a5c...'), ObjectId('5a5c...')],
"order_number" : ["4346", "4345"]
},
{
"_id" : [ObjectId('5a5c...'), ObjectId('5a5c...')],
"order_number" : ["4346", "4345"]
},
...
]
}
]
Where do I get it wrong?
To calculate totals and return a subset, you need to apply the grouping and the skip/limit to the same dataset. For that you can use the $facet stage (available since MongoDB 3.4).
For example, to show the 3rd page with 10 documents per page:
db.Order.aggregate([
{ '$match' : { "company_id" : ObjectId("54c0...") } },
{ '$sort' : { 'order_number' : -1 } },
{ '$facet' : {
metadata: [ { $count: "total" }, { $addFields: { page: NumberInt(3) } } ],
data: [ { $skip: 20 }, { $limit: 10 } ] // add a projection here if you wish to re-shape the docs
} }
] )
It will return a single document with 2 fields:
{
"metadata" : [
{
"total" : 300,
"page" : 3
}
],
"data" : [
{
... original document ...
},
{
... another document ...
},
{
... etc up to 10 docs ...
}
]
}
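The skip value of 20 above follows from (page - 1) * pageSize. As a small sketch of that arithmetic, the helper below builds the same $facet pipeline for any 1-based page number. `buildPagePipeline`, `companyId` and `pagePipeline` are hypothetical names introduced for illustration, and the company id is a placeholder string rather than a real ObjectId.

```javascript
// Hypothetical helper: builds the $facet pagination pipeline for a 1-based page.
function buildPagePipeline(filter, sort, page, pageSize) {
  if (page < 1 || pageSize < 1) {
    throw new Error("page and pageSize must be positive");
  }
  const skip = (page - 1) * pageSize; // page 3 with 10 per page skips 20
  return [
    { $match: filter },
    { $sort: sort },
    { $facet: {
        // NumberInt() is a shell helper; a plain number is used here instead
        metadata: [ { $count: "total" }, { $addFields: { page: page } } ],
        data: [ { $skip: skip }, { $limit: pageSize } ]
    } }
  ];
}

// Placeholder value; in practice this would be the ObjectId of the company.
const companyId = "<company id>";
const pagePipeline = buildPagePipeline(
  { company_id: companyId },
  { order_number: -1 },
  3,
  10
);

console.log(JSON.stringify(pagePipeline[2].$facet.data));
// prints [{"$skip":20},{"$limit":10}]
```

The resulting array can be passed to db.Order.aggregate() in place of the literal pipeline.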
Since MongoDB 5.0 there is another option that avoids the main disadvantage of $facet, namely that all returned documents are grouped into one big document. The concern is that a single document has a size limit of 16 MB. Using $setWindowFields avoids this:
db.Order.aggregate([
{$match: {company_id: ObjectId("54c0...") } },
{$sort: {order_number: -1 } },
{$setWindowFields: {output: {totalCount: {$count: {}}}}},
{$skip: 20 },
{$limit: 10 }
])
Below is the sample document of a collection, say "CollectionA"
{
"_id" : ObjectId("5ec3f19225701c4f7ab11a5f"),
"workshop" : ObjectId("5ebd37a3d33055331eb4730f"),
"participant" : ObjectId("5ebd382dd33055331eb47310"),
"status" : "analyzed",
"createdBy" : ObjectId("5eb7aa24d33055331eb4728c"),
"updatedBy" : ObjectId("5eb7aa24d33055331eb4728c"),
"results" : [
{
"analyze_by" : {
"user_name" : "m",
"user_id" : "5eb7aa24d33055331eb4728c"
},
"category_list" : [
"Communication",
"Controlling",
"Leading",
"Organizing",
"Planning",
"Staffing"
],
"analyzed_date" : ISODate("2020-05-19T14:48:49.993Z"),
}
],
"summary" : [],
"isDeleted" : false,
"isActive" : true,
"updatedDate" : ISODate("2020-05-19T14:48:50.827Z"),
"createdDate" : ISODate("2020-05-19T14:47:46.374Z"),
"__v" : 0
}
I need to query all the documents to get the length of each "results" array and return the sum of those lengths across all documents.
For example,
document 1 has "results" length - 5
document 2 has "results" length - 6
then output should be 11.
Can we write a query for this, instead of fetching all the documents, iterating over them, and adding up the results lengths?
If I understand correctly, you would like to project the length of the results attribute.
You should check whether the $size operator works for you.
https://docs.mongodb.com/manual/reference/operator/aggregation/size/
You can use $group and $sum to calculate the total of a field that holds the size of your results array. To create that field, you can use $size in $addFields to calculate the size of results in each document and store it in the field, as below:
db.getCollection('your_collection').aggregate([
{
$addFields: {
result_length: { $size: "$results"}
}
},
{
$group: {
_id: '',
total_result_length: { $sum: '$result_length' }
}
}
])
You can use an aggregation $group stage with the $sum and $size aggregation operators to get the total number of array elements across all documents in the collection.
db.collection.aggregate( [
{
$group: {
_id: null,
total_count: { $sum: { $size: "$results" } }
}
}
] )
Aggregation using Mongoose's Model.aggregate():
SomeModel.aggregate([
{
$group: {
_id: null,
total_count: { $sum: { $size: "$results" } }
}
}
])
.then(function (result) {
console.log(result);
});
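One caveat with the pipelines above: $size raises an error when its argument is not an array, so the aggregation fails if any document lacks a "results" array. The sketch below is one hedged way around that, assuming such documents should simply count as zero. `totalPipeline`, `totalResultsLength` and `exampleDocs` are illustrative names, and the function is a plain JavaScript model of the pipeline, not MongoDB code.

```javascript
// Pipeline for db.collection.aggregate(totalPipeline); $isArray needs MongoDB 3.2+.
const totalPipeline = [
  { $group: {
      _id: null,
      total_count: {
        $sum: {
          // Assumption: a missing or non-array "results" contributes 0.
          $cond: [ { $isArray: "$results" }, { $size: "$results" }, 0 ]
        }
      }
  } }
];

// Plain JavaScript model of the same computation.
function totalResultsLength(docs) {
  return docs.reduce(function (sum, doc) {
    return sum + (Array.isArray(doc.results) ? doc.results.length : 0);
  }, 0);
}

// Hypothetical documents matching the example in the question (5 + 6),
// plus one document with no "results" field at all.
const exampleDocs = [
  { results: [1, 2, 3, 4, 5] },
  { results: [1, 2, 3, 4, 5, 6] },
  { summary: [] }
];

console.log(totalResultsLength(exampleDocs));
// prints 11
```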
I want to build an online test application using MongoDB and Node.js. One feature lets an admin view a user's test history (with a date filter option).
How do I write the query if I want to display only the users whose testResults array contains the date specified by the admin?
The date filter will be based on the day, month, and year of scheduledAt.startTime, and I think I must use the aggregation framework to achieve this.
Let's say I have a users document like the one below:
{
"_id" : ObjectId("582a7b315c57b9164cac3295"),
"username" : "lalalala#gmail.com",
"displayName" : "lalala",
"testResults" : [
{
"applyAs" : [
"finance"
],
"scheduledAt" : {
"endTime" : ISODate("2016-11-15T16:00:00.000Z"),
"startTime" : ISODate("2016-11-15T01:00:00.000Z")
},
"results" : [
ObjectId("582a7b3e5c57b9164cac3299"),
ObjectId("582a7cc25c57b9164cac329d")
],
"_id" : ObjectId("582a7b3e5c57b9164cac3296")
},
{
.....
}
],
"password" : "andi",
}
testResults document:
{
"_id" : ObjectId("582a7cc25c57b9164cac329d"),
"testCategory" : "english",
"testVersion" : "EAX",
"testTakenTime" : ISODate("2016-11-15T03:10:58.623Z"),
"score" : 2,
"userAnswer" : [
{
"answer" : 1,
"problemId" : ObjectId("581ff74002bb1218f87f3fab")
},
{
"answer" : 0,
"problemId" : ObjectId("581ff78202bb1218f87f3fac")
},
{
"answer" : 0,
"problemId" : ObjectId("581ff7ca02bb1218f87f3fad")
}
],
"__v" : 0
}
This is what I have tried so far. If I want to count the total number of documents, which part of my aggregation should I change? In the query below, totalData is summed per group, not over the whole set of returned documents.
User
.aggregate([
{
$unwind: '$testResults'
},
{
$project: {
'_id': 1,
'displayName': 1,
'testResults': 1,
'dayOfTest': { $dayOfMonth: '$testResults.scheduledAt.startTime' },
'monthOfTest': { $month: '$testResults.scheduledAt.startTime' },
'yearOfTest': { $year: '$testResults.scheduledAt.startTime' }
}
},
{
$match: {
dayOfTest: date.getDate(),
monthOfTest: date.getMonth() + 1,
yearOfTest: date.getFullYear()
}
},
{
$group: {
_id: {id: '$_id', displayName: '$displayName'},
testResults: {
$push: '$testResults'
},
totalData: {
$sum: 1
}
}
},
])
.then(function(result) {
res.send(result);
})
.catch(function(err) {
console.error(err);
next(err);
});
You can try something like this. The $project stage keeps a document's test results if any element of the array matches the date passed in. Add this as the first step in the pipeline, and then add the grouping stage the way you want.
$map applies an equality comparison between the date passed in and the start date of each test result element, generating an array of true and false values. $anyElementTrue inspects this array and returns true only if there is at least one true value in it. The $match stage then keeps only the documents whose matched value is true.
aggregate([{
"$project": {
"_id": 1,
"displayName":1,
"testResults": 1,
"matched": {
"$anyElementTrue": {
"$map": {
"input": "$testResults",
"as": "result",
"in": {
"$eq": [{ $dateToString: { format: "%Y-%m-%d", date: '$$result.scheduledAt.startTime' } }, '2016-11-15']
}
}
}
}
}
}, {
"$match": {
"matched": true
}
}])
Alternative version:
Similar to the above version, but this one combines the project and match stages into one. The $cond inside $redact performs the match: when a match is found it keeps the complete document, otherwise it discards it.
aggregate([{
"$redact": {
"$cond": [{
"$anyElementTrue": {
"$map": {
"input": "$testResults",
"as": "result",
"in": {
"$eq": [{
$dateToString: {
format: "%Y-%m-%d",
date: '$$result.scheduledAt.startTime'
}
}, '2016-11-15']
}
}
}
},
"$$KEEP",
"$$PRUNE"
]
}
}])
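Both versions above compute a date string for every array element, which generally prevents an index on scheduledAt.startTime from helping. As an alternative sketch, a whole-day filter can be expressed as a half-open date range with $elemMatch. This assumes the admin's date arrives as a "YYYY-MM-DD" string and that the day is interpreted in UTC (which is what $dateToString does when no timezone is given). `dayRangeFilter` and `dayFilter` are hypothetical names.

```javascript
// Hypothetical helper: turns "YYYY-MM-DD" into a query filter that matches
// users having at least one test result whose startTime falls on that UTC day.
function dayRangeFilter(day) {
  const start = new Date(day + "T00:00:00.000Z");
  if (isNaN(start.getTime())) {
    throw new Error("invalid date: " + day);
  }
  const end = new Date(start.getTime() + 24 * 60 * 60 * 1000); // next UTC midnight
  return {
    testResults: {
      $elemMatch: {
        "scheduledAt.startTime": { $gte: start, $lt: end }
      }
    }
  };
}

const dayFilter = dayRangeFilter("2016-11-15");
// Could be used as { $match: dayFilter } at the start of a pipeline,
// or passed directly to a find() call.
const range = dayFilter.testResults.$elemMatch["scheduledAt.startTime"];
console.log(range.$gte.toISOString(), range.$lt.toISOString());
// prints 2016-11-15T00:00:00.000Z 2016-11-16T00:00:00.000Z
```

Like the $project/$match version, this keeps the whole testResults array of each matching user; it only decides which users are returned.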
This is the collection structure:
[{
"_id" : "....",
"name" : "aaaa",
"level_max_leaves" : [
{
level : "ObjectIdString 1",
max_leaves : 4,
}
]
},
{
"_id" : "....",
"name" : "bbbb",
"level_max_leaves" : [
{
level : "ObjectIdString 2",
max_leaves : 2,
}
]
}]
I need to find the subdocuments whose level_max_leaves.level value matches a given input value.
This is how I tried it, for example:
var empLevelId = 'ObjectIdString 1' ;
MyModel.aggregate(
{$unwind: "$level_max_leaves"},
{$match: {"$level_max_leaves.level": empLevelId } },
{$group: { "_id": "$level_max_leaves.level",
"total": { "$sum": "$level_max_leaves.max_leaves" }}},
function (err, res) {
console.log(res);
});
But here the $match filter is not working; I can't get the exact results for ObjectIdString 1.
If I filter on the name field, it works fine, like this:
{$match: {"$name": "aaaa" } },
But at the subdocument level it returns 0 results:
{$match: {"$level_max_leaves.level": "ObjectIdString 1"} },
My expected result was:
{
"_id" : "ObjectIdString 1",
"total" : 4,
}
You have typed the $match incorrectly. Fields with $ prefixes are either for the implemented operators or for "variable" references to field content. So you just type the field name:
MyModel.aggregate(
[
{ "$match": { "level_max_leaves.level": "ObjectIdString 1" } },
{ "$unwind": "$level_max_leaves" },
{ "$match": { "level_max_leaves.level": "ObjectIdString 1" } },
{ "$group": {
"_id": "$level_max_leaves.level",
"total": { "$sum": "$level_max_leaves.max_leaves" }
}}
],
function (err, res) {
console.log(res);
}
);
Which on the sample you provide produces:
{ "_id" : "ObjectIdString 1", "total" : 4 }
It is also good practice to $match first in your pipeline. A $match at the start of the pipeline can use an index, which a $match applied after $unwind cannot. Beyond that, without the initial $match statement, your aggregation pipeline would perform an $unwind operation on every document in the collection, whether it met the conditions or not.
So generally what you want to do here is:
Match the documents that contain the required elements in the array
Unwind the array of the matching documents
Match the required array content excluding all others
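The three steps can be sketched as a small pipeline builder that takes the level id from the question's empLevelId variable instead of a hard-coded string. `sumMaxLeavesPipeline`, `modelSumMaxLeaves` and `levelDocs` are illustrative names, and the model function is plain JavaScript that mirrors the pipeline's result on the sample data, not MongoDB code.

```javascript
// Hypothetical builder for the match / unwind / match / group pipeline.
function sumMaxLeavesPipeline(levelId) {
  const byLevel = { "level_max_leaves.level": levelId }; // plain field path, no "$" prefix
  return [
    { $match: byLevel },               // 1. documents containing the element
    { $unwind: "$level_max_leaves" },  // 2. one document per array element
    { $match: byLevel },               // 3. drop the non-matching elements
    { $group: {
        _id: "$level_max_leaves.level",
        total: { $sum: "$level_max_leaves.max_leaves" }
    } }
  ];
}

// Plain JavaScript model of what the pipeline returns.
function modelSumMaxLeaves(docs, levelId) {
  let total = 0;
  let found = false;
  docs.forEach(function (doc) {
    (doc.level_max_leaves || []).forEach(function (entry) {
      if (entry.level === levelId) {
        total += entry.max_leaves;
        found = true;
      }
    });
  });
  return found ? [ { _id: levelId, total: total } ] : [];
}

// The sample collection from the question.
const levelDocs = [
  { name: "aaaa", level_max_leaves: [ { level: "ObjectIdString 1", max_leaves: 4 } ] },
  { name: "bbbb", level_max_leaves: [ { level: "ObjectIdString 2", max_leaves: 2 } ] }
];

const empLevelId = "ObjectIdString 1";
console.log(JSON.stringify(modelSumMaxLeaves(levelDocs, empLevelId)));
// prints [{"_id":"ObjectIdString 1","total":4}]
```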
My structure.
User:
{
name: "One",
favoriteWorkouts: [ids of workouts],
workouts: [ { name: "My workout 1" },...]
}
I want to get a list of the favorite/hottest workouts from the database.
db.users.aggregate(
{ $unwind : "$favorite" },
{ $group : { _id : "$favorite" , number : { $sum : 1 } } },
{ $sort : { number : -1 } }
)
This returns
{
"hot": [
{
"_id": "521f6c27145c5d515f000006",
"number": 1
},
{
"_id": "521f6c2f145c5d515f000007",
"number": 1
},...
]}
But I want
{
hot: [
{object of hottest workout 1, object of hottest workout 2,...}
]}
How do I sort by hottest and fill the result with objects, not just ids?
You are correct to want to use MongoDB's aggregation framework. Aggregation will give you the output you are looking for if used correctly. If you are looking for just a list of the _id's of all users' favorite workouts, then I believe that you would need to add an additional $group operation to your pipeline:
db.users.aggregate(
{ $unwind : "$favoriteWorkouts" },
{ $group : { _id : "$favoriteWorkouts", number : { $sum : 1 } } },
{ $sort : { number : -1 } },
{ $group : { _id : "oneDocumentWithWorkoutArray", hot : { $push : "$_id" } } }
)
This will yield a document of the following form, with the workout ids listed by popularity:
{
"_id" : "oneDocumentWithWorkoutArray",
"hot" : [
"workout6",
"workout1",
"workout5",
"workout4",
"workout3",
"workout2"
]
}
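To make the four stages concrete, here is a plain JavaScript model of what the pipeline computes: count how often each workout id appears across users, sort by that count in descending order, and collect the ids into a single document. This is only a sketch for reasoning about the output, not MongoDB code. `hottestWorkoutIds` and `sampleUsers` are hypothetical, and the sample counts are chosen without ties, since the order of tied entries is not something the pipeline above pins down.

```javascript
// Plain JavaScript model of $unwind + $group + $sort + $group/$push.
function hottestWorkoutIds(users) {
  const counts = new Map();
  users.forEach(function (user) {
    // $unwind: visit each favorite id of each user
    (user.favoriteWorkouts || []).forEach(function (id) {
      // $group with $sum: 1
      counts.set(id, (counts.get(id) || 0) + 1);
    });
  });
  const hot = Array.from(counts.entries())
    .sort(function (a, b) { return b[1] - a[1]; }) // $sort: { number: -1 }
    .map(function (entry) { return entry[0]; });   // $push: "$_id"
  return { _id: "oneDocumentWithWorkoutArray", hot: hot };
}

// Hypothetical users: w2 is favored 3 times, w3 twice, w1 once.
const sampleUsers = [
  { name: "One", favoriteWorkouts: ["w1", "w2"] },
  { name: "Two", favoriteWorkouts: ["w2", "w3"] },
  { name: "Three", favoriteWorkouts: ["w2", "w3"] }
];

console.log(hottestWorkoutIds(sampleUsers).hot);
// prints [ 'w2', 'w3', 'w1' ]
```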