Given that I have a document of the following structure:
{
selectedId: ObjectId("57b5fb2d7b41dde99009bc75"),
children: [
{_id: ObjectId("57b5fb2d7b41dde99009bc75"), val: 10},
{_id: ObjectId("57b5fb2d7b41dde99009bc75"), val: 20},
]
}
where the parent value "selectedId" always references one of the children's _id values, how do I get just the child subdocument where _id = selectedId?
I attempted:
parentModel.findOne({'selectedId': 'this.children._id'})
however, as I now know, the second string is taken as a literal. So how do I reference the parent's field in the query?
Edit: obviously, this could be done with two queries, getting the parent's "selectedId" value and then querying again. However, I want to do this in a single query.
You could use the aggregation framework, in particular leveraging the $arrayElemAt and $filter operators, to return the child subdocument. The following example shows this:
parentModel.aggregate([
{
"$project": {
"children": {
"$arrayElemAt": [
{
"$filter": {
"input": "$children",
"as": "item",
"cond": {
"$eq": ["$$item._id", "$selectedId"]
}
}
}, 0
]
}
}
}
]).exec(callback);
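Assuming the sample document above, the result would look something like this (the non-matching children are dropped and "children" becomes the single matching subdocument):
{
    "_id" : ObjectId("..."),
    "children" : { "_id" : ObjectId("57b5fb2d7b41dde99009bc75"), "val" : 10 }
}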
I have a slides collection that has an order array containing ids.
I would like to use updateOne to set multiple items in that order array.
I tried this which updates one value in the array:
db.slides.updateOne({
_id: '831e0572-0f04-4d84-b1cf-64ffa9a12199'
},
{$set: {'order.0': 'b6386841-2ff7-4d90-af5d-7499dd49ca4b'}}
)
That correctly updates (or sets) the array value with index 0.
However, I want to set more array values, and since updateOne also supports an aggregation pipeline, I tried this:
db.slides.updateOne({
_id: '831e0572-0f04-4d84-b1cf-64ffa9a12199'
},
[
{$set: {'order.0': 'b6386841-2ff7-4d90-af5d-7499dd49ca4b1'}}
]
)
This does NOTHING if the order array is empty. But if it's not, it replaces every element in the order array with an object { 0: 'b6386841-2ff7-4d90-af5d-7499dd49ca4b1' }.
I don't understand that behavior.
In the optimal case I would just do
db.slides.updateOne({
_id: '831e0572-0f04-4d84-b1cf-64ffa9a12199'
},
[
{$set: {'order.0': 'b6386841-2ff7-4d90-af5d-7499dd49ca4b1'}},
{$set: {'order.1': 'otherid'}},
{$set: {'order.2': 'anotherone'}},
]
)
And that would just update the order array with the values.
What is happening here and how can I achieve my desired behavior?
Updating an array element by index position is only supported in regular update queries, not in aggregation pipelines.
This behavior is explained in the documentation for the regular update $set operator, but not for the aggregation $set stage.
The correct implementation in regular update query:
db.slides.updateOne({
_id: '831e0572-0f04-4d84-b1cf-64ffa9a12199'
},
{
$set: {
'order.0': 'b6386841-2ff7-4d90-af5d-7499dd49ca4b1',
'order.1': 'otherid',
'order.2': 'anotherone'
}
}
)
If you specifically want an aggregation pipeline, it is a much longer process than the regular update query above. I don't recommend going that way; instead, format your input in your client-side language and use a regular update query.
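For example, a minimal sketch of building that update document client-side from parallel arrays of indexes and values (variable names are illustrative):
// Build { "order.<index>": value } pairs for one regular update
const indexes = [0, 1, 2];
const values = [
    'b6386841-2ff7-4d90-af5d-7499dd49ca4b1',
    'otherid',
    'anotherone'
];
const setDoc = {};
indexes.forEach((idx, i) => { setDoc['order.' + idx] = values[i]; });

db.slides.updateOne(
    { _id: '831e0572-0f04-4d84-b1cf-64ffa9a12199' },
    { $set: setDoc }
);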
If you have to use the aggregation framework, try this (you will have to pass the array of indexes and the array of updated values separately):
$map and $range to iterate over the order array by index
$cond and $arrayElemAt to check whether the current index is in the array of indexes that have to be updated. If it is, replace it with the element at the same index in the array of new values; if not, keep the current value.
NOTE: This will work only if the array of indexes that you want to update starts from 0 and goes up (as in your example).
db.collection.update(
  { _id: '831e0572-0f04-4d84-b1cf-64ffa9a12199' },
  [
    {
      "$set": {
        "order": {
          "$map": {
            // iterate over the indexes 0..size-1 of the order array
            "input": { "$range": [0, { "$size": "$order" }] },
            "in": {
              "$cond": [
                // is the current index one of the indexes to update?
                { "$in": ["$$this", [0, 1, 2]] },
                // yes: take the new value at the same index
                {
                  "$arrayElemAt": [
                    ["b6386841-2ff7-4d90-af5d-7499dd49ca4b1", "otherid", "anotherone"],
                    "$$this"
                  ]
                },
                // no: keep the existing value
                { "$arrayElemAt": ["$order", "$$this"] }
              ]
            }
          }
        }
      }
    }
  ]
)
Here is the working example: https://mongoplayground.net/p/P4irM9Ouyza
I am new to using MongoDB and Mongoose for my backend stack and I'm having a hard time getting from SQL to NoSQL when it comes to query building.
I have a collection of documents that look like this:
{
    timestamp: "12313113",
    symbol: "XY",
    amount: 121212,
    value: 24324234
}
I want to query the collection to get the following output grouped by symbol:
{
    symbol: "XY",
    occurences: 1231,
    summedAmount: 2131231,
    summedValue: 23131313
}
Could anyone tell me how to do it using aggregate on the Model? My timestamp filtering works already, but the grouping throws errors
let result = await TransactionEvent.aggregate([
{
$match : {
timestamp : { $gte: new Date(Date.now() - INTERVALS[timeframe]) }
}
},
{
$group : {
            // what to do in here
        }
    }
]);
Let's say I have another field in my object with a key of "direction" that can either be "IN" or "OUT". How could I also group the occurrences of these values?
Expected output
{
    symbol: "XY",
    occurences: 1231,
    summedAmount: 2131231,
    summedValue: 23131313,
    in: <occurrences where the direction property is "IN">,
    out: <occurrences where the direction property is "OUT">
}
In MongoDB's $group stage, the _id key is mandatory and
should be set to the key you want to group by (it's symbol in your case).
Make sure you prefix it with a $ sign, since you are referencing a key in your document.
Following the _id key, you can add all the additional operations to be performed for the required keys. In your specific use case, use $sum to add values to a user-defined key.
Note: Use "$sum": 1 to add 1 for each occurrence, and "$sum": "$<key-name>" to add up an existing key's values.
The code below should be your $group stage:
{
"$group": {
"_id": "$symbol", // Group by key (Use Sub-Object to group by multiple keys
"occurences": {"$sum": 1}, // Add `1` for each occurences
"summedAmount": {"$sum": "$amount"}, // Add `amount` values of grouped data
"summedValue": {"$sum": "$value"}, // Add `value` values of grouped data
}
}
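With the single sample document from the question, this stage would emit something like:
{ "_id" : "XY", "occurences" : 1, "summedAmount" : 121212, "summedValue" : 24324234 }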
Comment if you have any additional doubts.
You can use $group and $sum:
db.collection.aggregate([
{
"$group": {
"_id": "$symbol",
"summbedAmount": {
"$sum": "$amount"
},
"summbedValue": {
"$sum": "$value"
},
"occurences": {
$sum: 1
}
}
}
])
Working Mongo playground
Update 1
You can use $cond to check a condition. It takes three parameters:
First parameter: the condition to evaluate
Second parameter: what to return if the condition is true (increase by 1 if the condition is true)
Third parameter: what to return if the condition is false (no increase)
Here is the code:
db.collection.aggregate([
{
"$group": {
"_id": "$symbol",
"summbedAmount": { "$sum": "$amount" },
"summbedValue": { "$sum": "$value" },
"occurences": { $sum: 1 },
in: {
$sum: {
$cond: [ { $eq: [ "$direction", "in" ] }, 1, 0 ]
}
},
out: {
$sum: {
$cond: [ { $eq: [ "$direction", "out" ] }, 1, 0 ] }
}
}
}
])
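With documents containing a direction field, each group would come out shaped like this (values illustrative):
{ "_id" : "XY", "summedAmount" : 121212, "summedValue" : 24324234, "occurences" : 1, "in" : 1, "out" : 0 }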
Working Mongo playground
I have a mongoose schema that is structured like this:
Schema E = {
_id,
... some fields,
details: [
{
...somefields,
people: [ObjectIds]
}
]
}
First, I have an aggregate query where I am using $geoNear then $match, and then $facet.
After the operations the document that I get is as follows:
{
    estates: [
        {
            _id,
            ... some fields,
            details: [
                {
                    ...somefields,
                    people: [ObjectIds]
                }
            ],
            ... other fields
        },
        ... more estate objects
    ],
    page: [...some objects]
}
I have an array called approved which holds some ObjectIds.
I want to filter the people array inside estates.details while keeping the rest of the fields intact.
The result I want is as follows:
NOTE: The field filteredPeople is the array I want after filtering people against approved.
{
    estates: [
        {
            _id,
            ... some fields,
            details: [
                {
                    ...somefields,
                    filteredPeople: [ObjectIds],
                    numberOfPeople: <size of the people array>
                }
            ],
            ... other fields
        },
        ... more estate objects
    ],
    page: [...some objects]
}
This is what I tried doing:
{
"estates": {
"$map": {
"input": "$estates",
"as": "estate",
"in": {
"details": {
"$map": {
"input": "$$estate.details",
"as": "detail",
"in": {
"filteredPeople": {
"$filter": {
"input": "$$detail.people",
"as": "people",
"cond": { "$in": ["$$people", approved] }
}
}
}
}
}
}
}
},
}
But this erases the other fields. The other way is to create a separate field called estatePeople where the result of the $addFields will be stored.
I could then try to merge the two arrays. But I don't have any field to match them on, since the second estatePeople array will contain nothing but filteredPeople. I would then have to merge the two arrays purely by the index at which they appear.
Can someone please help me out on how to get the desired document with relatively good performance?
For anyone who has the same problem:
In the end, I was unable to find any way to execute the query that I wanted with reasonable performance.
This schema design is not optimal for executing such complicated queries. What I ended up doing was making each detail its own document in a separate collection, with a parent schema keeping references to the details of the same estate.
You can use reverse referencing or referencing according to the queries that you want to execute.
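That said, for anyone who still wants to attempt the single-query approach: the usual way to keep sibling fields intact inside $map is $mergeObjects (available since MongoDB 3.6). A rough, untested sketch against the structure above:
{
    "$addFields": {
        "estates": {
            "$map": {
                "input": "$estates",
                "as": "estate",
                "in": {
                    "$mergeObjects": [
                        "$$estate", // keep all the estate's existing fields
                        {
                            "details": {
                                "$map": {
                                    "input": "$$estate.details",
                                    "as": "detail",
                                    "in": {
                                        "$mergeObjects": [
                                            "$$detail", // keep all the detail's existing fields
                                            {
                                                "filteredPeople": {
                                                    "$filter": {
                                                        "input": "$$detail.people",
                                                        "as": "person",
                                                        "cond": { "$in": ["$$person", approved] }
                                                    }
                                                },
                                                "numberOfPeople": { "$size": "$$detail.people" }
                                            }
                                        ]
                                    }
                                }
                            }
                        }
                    ]
                }
            }
        }
    }
}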
Is there a way to specify a heterogeneous array as a schema property where it can contain both ObjectIds and strings? I'd like to have something like the following:
var GameSchema = new mongoose.Schema({
    players: {
        type: [<UserModel reference|IP address/socket ID/what have you>]
    }
});
Is the only option a Mixed type that I manage myself? I've run across discriminators, which look somewhat promising, but it looks like it only works for subdocuments and not references to other schemas. Of course, I could just have a UserModel reference and create a UserModel that just stores the IP address or whatever I'm using to identify them, but that seems like it could quickly get hugely out of control in terms of space (having a model for every IP I come across sounds bad).
EDIT:
Example:
A game has one logged in user, three anonymous users, the document should look something like this:
{ players: [ ObjectId("5fd88ea85...."), "192.0.0.1", "192.1.1.1", "192.2.2.1"] }
Ideally this would be populated to:
{ players: [ UserModel(id: ..., name: ...), "192.0.0.1", "192.1.1.1", "192.2.2.1"] }
EDIT:
I've decided to go a different route: instead of mixing types, I'm differentiating with different properties. Something like this:
players: [
{
user: <object reference>,
sessionID: <string>,
color: {
type: String
},
...other properties...
}
]
I have a validator that ensures only one of user or sessionID are populated for a given entry. In some ways this is more complex, but it does obviate the need to do this kind of conditional populating and figuring out what type each entry is when iterating over them. I haven't tried any of the answers, but they look promising.
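For reference, such a validator might look roughly like this (a sketch only; path names follow the snippet above):
// Each player entry must have exactly one of `user` or `sessionID`
GameSchema.path('players').validate(function (players) {
    return players.every(function (p) {
        return Boolean(p.user) !== Boolean(p.sessionID); // XOR
    });
}, 'Each player must have exactly one of user or sessionID');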
If you are content to go with using Mixed, or at least some scheme that will not work with .populate(), then you can shift the "join" responsibility to the "server" instead, using the $lookup functionality of MongoDB and a little fancy matching.
For me if I have a "games" collection document like this:
{
"_id" : ObjectId("5933723c886d193061b99459"),
"players" : [
ObjectId("5933723c886d193061b99458"),
"10.1.1.1",
"10.1.1.2"
],
"__v" : 0
}
Then I send the statement to the server to "join" with the "users" collection data where an ObjectId is present like this:
Game.aggregate([
{ "$addFields": {
"users": {
"$filter": {
"input": "$players",
"as": "p",
"cond": { "$gt": [ "$$p", {} ] }
}
}
}},
{ "$lookup": {
"from": "users",
"localField": "users",
"foreignField": "_id",
"as": "users"
}},
{ "$project": {
"players": {
"$map": {
"input": "$players",
"as": "p",
"in": {
"$cond": {
"if": { "$gt": [ "$$p", {} ] },
"then": {
"$arrayElemAt": [
{ "$filter": {
"input": "$users",
"as": "u",
"cond": { "$eq": [ "$$u._id", "$$p" ] }
}},
0
]
},
"else": "$$p"
}
}
}
}
}}
])
Which gives the result when joined to the users object as:
{
"_id" : ObjectId("5933723c886d193061b99459"),
"players" : [
{
"_id" : ObjectId("5933723c886d193061b99458"),
"name" : "Bill",
"__v" : 0
},
"10.1.1.1",
"10.1.1.2"
]
}
So the "fancy" part really relies on this logical statement when considering the entries in the "players" array:
"$filter": {
"input": "$players",
"as": "p",
"cond": { "$gt": [ "$$p", {} ] }
}
How this works is that to MongoDB, an ObjectId and actually all BSON types have a specific sort precedence. In this case where the data is "Mixed" between ObjectId and String then the "string" values are considered "less than" the value of a "BSON Object", and the ObjectId values are "greater than".
This allows you to separate the ObjectId values from the source array into their own list. Given that list, you $lookup to perform the "join" and get the objects from the other collection.
In order to put them back, I'm using $map to "transpose" each element of the original "players" array, replacing matched ObjectId values with their related user object. An alternate approach would be to "split" the two types, do the $lookup and $concatArrays between the users and the "strings". But that would not maintain the original array order, so $map may be a better fit.
I will also note that the same basic process can be applied in a "client" operation by similarly filtering the content of the "players" array to contain just the ObjectId values and then calling the "model" form of .populate() from "inside" the response of the initial query. The documentation shows an example of that form of usage, as do some answers on this site from before it was possible to do a "nested populate" with mongoose.
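For illustration, a rough sketch of that client-side emulation using a plain second query rather than .populate() itself (assumes mongoose is in scope, plus Game and User models; error handling omitted):
Game.findById(gameId).lean().then(function (game) {
    // Split out the ObjectId entries; plain strings pass through untouched
    var ids = game.players.filter(function (p) {
        return p instanceof mongoose.Types.ObjectId;
    });
    return User.find({ "_id": { "$in": ids } }).then(function (users) {
        var byId = {};
        users.forEach(function (u) { byId[String(u._id)] = u; });
        // Transpose matched users back into their original positions
        game.players = game.players.map(function (p) {
            return byId[String(p)] || p;
        });
        return game;
    });
});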
The other point to keep in mind here is that .populate() itself existed as a mongoose method long before the $lookup aggregation pipeline operator came about, and was a solution for a time when MongoDB itself was incapable of performing a "join" of any sort. So the operations are indeed "client" side as an emulation and really only perform additional queries that you do not need to be aware of in issuing the statements yourself.
Therefore it should generally be desirable in a modern scenario to use the "server" features, and avoid the overhead involved with multiple queries in order to get the result.
I'm fairly good with SQL queries, but I can't seem to get my head around grouping and getting the sum of MongoDB documents.
With this in mind, I have a Job model with a schema like the one below:
{
name: {
type: String,
required: true
},
info: String,
active: {
type: Boolean,
default: true
},
all_service: [{
price: {
type: Number,
min: 0,
required: true
},
all_sub_item: [{
name: String,
price:{ // << -- this is the price I want to calculate
type: Number,
min: 0
},
owner: {
user_id: { // <<-- here is the filter I want to put
type: Schema.Types.ObjectId,
required: true
},
name: String,
...
}
}]
}],
date_create: {
type: Date,
default : Date.now
},
date_update: {
type: Date,
default : Date.now
}
}
I would like to have a sum of the price field where an owner is present. I tried the query below, but with no luck:
Job.aggregate(
[
{
$group: {
_id: {}, // not sure what to put here
amount: { $sum: '$all_service.all_sub_item.price' }
},
$match: {'not sure how to limit the user': given_user_id}
}
],
//{ $project: { _id: 1, expense: 1 }}, // you can only project fields from 'group'
function(err, summary) {
console.log(err);
console.log(summary);
}
);
Could someone guide me in the right direction? Thank you in advance.
Primer
As is correctly noted earlier, it does help to think of an aggregation "pipeline" just as the "pipe" | operator from Unix and other system shells. One "stage" feeds input to the "next" stage and so on.
The thing you need to be careful with here is that you have "nested" arrays, one array within another, and this can make drastic differences to your expected results if you are not careful.
Your documents consist of an "all_service" array at the top level. Presumably there are often "multiple" entries here, all containing your "price" property as well as "all_sub_item". Then of course "all_sub_item" is an array in itself, also containing many items of its own.
You can think of these arrays as the "relations" between your tables in SQL, in each case a "one-to-many". But the data is in a "pre-joined" form, where you can fetch all data at once without performing joins. That much you should already be familiar with.
However, when you want to "aggregate" across documents, you need to "de-normalize" this in much the same way as in SQL by "defining" the "joins". This is to "transform" the data into a de-normalized state that is suitable for aggregation.
So the same visualization applies. A master document's entries are replicated by the number of child documents, and a "join" to an "inner-child" will replicate both the master and initial "child" accordingly. In a "nutshell", this:
{
"a": 1,
"b": [
{
"c": 1,
"d": [
{ "e": 1 }, { "e": 2 }
]
},
{
"c": 2,
"d": [
{ "e": 1 }, { "e": 2 }
]
}
]
}
Becomes this:
{ "a" : 1, "b" : { "c" : 1, "d" : { "e" : 1 } } }
{ "a" : 1, "b" : { "c" : 1, "d" : { "e" : 2 } } }
{ "a" : 1, "b" : { "c" : 2, "d" : { "e" : 1 } } }
{ "a" : 1, "b" : { "c" : 2, "d" : { "e" : 2 } } }
And the operation to do this is $unwind, and since there are multiple arrays then you need to $unwind both of them before continuing any processing:
db.collection.aggregate([
{ "$unwind": "$b" },
{ "$unwind": "$b.d" }
])
So there the "pipe" first array from "$b" like so:
{ "a" : 1, "b" : { "c" : 1, "d" : [ { "e" : 1 }, { "e" : 2 } ] } }
{ "a" : 1, "b" : { "c" : 2, "d" : [ { "e" : 1 }, { "e" : 2 } ] } }
Which leaves a second array referenced by "$b.d" to be further de-normalized into the final de-normalized result "without any arrays". This allows other operations to process it.
Solving
With just about "every" aggregation pipeline, the "first" thing you want to do is "filter" the documents to only those that contain your results. This is a good idea, as especially when doing operations such as $unwind, then you don't want to be doing that on documents that do not even match your target data.
So you need to match your "user_id" at the array depth. But this is only part of getting the result, since you should be aware of what happens when you query a document for a matching value in an array.
Of course, the "whole" document is still returned, because this is what you really asked for. The data is already "joined" and we haven't asked to "un-join" it in any way.You look at this just as a "first" document selection does, but then when "de-normalized", every array element now actualy represents a "document" in itself.
So not "only" do you $match at the beginning of the "pipeline", you also $match after you have processed "all" $unwind statements, down to the level of the element you wish to match.
Job.aggregate(
[
// Match to filter possible "documents"
{ "$match": {
"all_service.all_sub_item.owner": given_user_id
}},
// De-normalize arrays
{ "$unwind": "$all_service" },
{ "$unwind": "$all_service.all_subitem" },
// Match again to filter the array elements
{ "$match": {
"all_service.all_sub_item.owner": given_user_id
}},
// Group on the "_id" for the "key" you want, or "null" for all
{ "$group": {
"_id": null,
"total": { "$sum": "$all_service.all_sub_item.price" }
}}
],
function(err,results) {
}
)
Alternatively, modern MongoDB releases since 2.6 also support the $redact operator. This could be used in this case to "pre-filter" the array content before processing with $unwind:
Job.aggregate(
[
// Match to filter possible "documents"
{ "$match": {
"all_service.all_sub_item.owner": given_user_id
}},
// Filter arrays for matches in document
{ "$redact": {
"$cond": {
"if": {
"$eq": [
{ "$ifNull": [ "$owner", given_user_id ] },
given_user_id
]
},
"then": "$$DESCEND",
"else": "$$PRUNE"
}
}},
// De-normalize arrays
{ "$unwind": "$all_service" },
{ "$unwind": "$all_service.all_subitem" },
// Group on the "_id" for the "key" you want, or "null" for all
{ "$group": {
"_id": null,
"total": { "$sum": "$all_service.all_sub_item.price" }
}}
],
function(err,results) {
}
)
That can "recursively" traverse the document and test for the condition, effectively removing any "un-matched" array elements before you even $unwind. This can speed things up a bit since items that do not match would not need to be "un-wound". However there is a "catch" in that if for some reason the "owner" did not exist on an array element at all, then the logic required here would count that as another "match". You can always $match again to be sure, but there is still a more efficient way to do this:
Job.aggregate(
[
// Match to filter possible "documents"
{ "$match": {
"all_service.all_sub_item.owner": given_user_id
}},
// Filter arrays for matches in document
{ "$project": {
"all_items": {
"$setDifference": [
{ "$map": {
"input": "$all_service",
"as": "A",
"in": {
"$setDifference": [
{ "$map": {
"input": "$$A.all_sub_item",
"as": "B",
"in": {
"$cond": {
"if": { "$eq": [ "$$B.owner", given_user_id ] },
"then": "$$B",
"else": false
}
}
}},
[false]
]
}
}},
[[]]
]
}
}},
// De-normalize the "two" level array. "Double" $unwind
{ "$unwind": "$all_items" },
{ "$unwind": "$all_items" },
// Group on the "_id" for the "key" you want, or "null" for all
{ "$group": {
"_id": null,
"total": { "$sum": "$all_items.price" }
}}
],
function(err,results) {
}
)
That process cuts down the size of the items in both arrays "drastically" compared to $redact. The $map operator processes each element of an array against the given expression within "in". In this case, each "outer" array element is sent to another $map to process the "inner" elements.
A logical test is performed here with $cond, whereby if the "condition" is met then the "inner" array element is returned, otherwise the false value is returned.
The $setDifference is used to filter out any false values that are returned, or, in the "outer" case, any "blank" arrays that result when every "inner" value was filtered out because there was no match. This leaves just the matching items, encased in a "double" array, e.g.:
[[{ "_id": 1, "price": 1, "owner": "b" },{..}],[{..},{..}]]
As "all" array elements have an _id by default with mongoose (and this is a good reason why you keep that) then every item is "distinct" and not affected by the "set" operator, apart from removing the un-matched values.
Process $unwind "twice" to convert these into plain objects in their own documents, suitable for aggregation.
So those are the things you need to know. As I stated earlier, be "aware" of how the data "de-normalizes" and what that implies towards your end totals.
It sounds like you want the SQL equivalent of "SUM(price) WHERE owner IS NOT NULL".
On that assumption, you'll want to do your $match first, to reduce the input set to your sum. So your first stage should be something like
$match: { "all_service.all_sub_item.owner": { $exists: true } }
Think of this as then passing all matching documents to your second stage.
Now, because you are summing an array, you have to do another step. Aggregation operators work on documents - there isn't really a way to sum an array. So we want to expand your array so that each element in the array gets pulled out to represent the array field as a value, in its own document. Think of this as a cross join. This will be $unwind.
$unwind: { "$all_service.all_sub_items" }
Now you've just made a much larger number of documents, but in a form where we can sum them. Now we can perform the $group. In your $group, you specify a transformation. The line:
_id: {}, // not sure what to put here
is creating a field in the output document, which is not the same documents as the input documents. So you can make the _id here anything you'd like, but think of this as the equivalent to your "GROUP BY" in sql. The $sum operator will essentially be creating a sum for each group of documents you create here that match that _id - so essentially we'll be "re-collapsing" what you just did with $unwind, by using the $group. But this will allow $sum to work.
I think you're looking for grouping on just your main document id, so I think your $sum statement in your question is correct.
$group : { _id : "$_id", totalAmount : { $sum : '$all_service.all_sub_item.price' } }
This will output documents with an _id field equivalent to your original document ID, and your sum.
I'll let you put it together, I'm not super familiar with node. You were close but I think moving your $match to the front and using an $unwind stage will get you where you need to be. Good luck!
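Putting it together (and keeping the double $unwind from the earlier answer, since there are two levels of nested arrays), a rough sketch would be:
Job.aggregate(
    [
        { "$match": { "all_service.all_sub_item.owner": { "$exists": true } } },
        // De-normalize both array levels before grouping
        { "$unwind": "$all_service" },
        { "$unwind": "$all_service.all_sub_item" },
        // Filter the now-flattened elements again
        { "$match": { "all_service.all_sub_item.owner": { "$exists": true } } },
        // Group back per original document and sum the prices
        { "$group": {
            "_id": "$_id",
            "totalAmount": { "$sum": "$all_service.all_sub_item.price" }
        }}
    ],
    function(err, summary) {
        console.log(err);
        console.log(summary);
    }
)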