MongoDB: auto sort documents in ascending order


I have a list of documents in a database collection. Their order can be changed from the front end by drag and drop, and each document has an order field by which all documents are sorted in ascending order. For example,
[{
order: 1,
no: 1
......
},
{
order: 2,
no: 2
......
},
{
order: 3,
no: 3
......
}]
Now, when drag-and-drop is applied, the order values may change. For example -
[{
order: 3,
no: 1
......
},
{
order: 1,
no: 2
......
},
{
order: 2,
no: 3
......
}]
Now it is not an issue to sort the list from the front end. Later by clicking a button, a new record is added to the top of the list. It's not a problem at all to sort again to the ascending order from the client side.
Actual Issue
I have pagination on the page, so the client never has all the items and cannot sort them itself. The sorting therefore has to be done in the MongoDB query. For example, using the following query I can sort documents -
db.collection.aggregate([
{
"$sort": {
order: 1
}
}
])
But there is a problem: if the order field has duplicate numbers or no value at all, the above query can't sort the documents properly. Is there a query that renumbers the order field in ascending order regardless of duplicates or missing values?
For example, if we have documents in the following format -
[
{
order: null,
no: 1,
...
},
{
order: 1,
no: 2,
...
},
{
order: 1,
no: 3,
...
},
{
order: 3,
no: 4,
...
},
]
it should be converted to -
[
{
order: 1,
no: 1,
...
},
{
order: 2,
no: 2,
...
},
{
order: 3,
no: 3,
...
},
{
order: 4,
no: 4,
...
},
]

We can use $addFields and $ifNull to give a nullable field a default value in the query, and then sort on it:
db.collection.aggregate([
  {
    $addFields: {
      order: { $ifNull: ["$order", -1] }, // order will be -1 if it is null or missing
    },
  },
  { $sort: { order: 1 } },
])
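If the goal is to rewrite order as a gap-free 1..n sequence rather than just substitute a default, the numbering rule has to be fixed first. The sketch below is only an assumption about what the question wants: documents without an order come first, duplicates are broken by the no field, and positions are then assigned sequentially. It is plain JavaScript with hypothetical names (compareDocs, renumber), not a MongoDB API. On MongoDB 5.0 or later, a $setWindowFields stage using $documentNumber with the same sort keys should be able to compute the same numbering on the server.

```javascript
// Hypothetical helper: orders documents the way an ascending sort on `order`
// would for this data (null/missing before numbers), with `no` as a tie-breaker.
function compareDocs(a, b) {
  const aMissing = a.order === null || a.order === undefined;
  const bMissing = b.order === null || b.order === undefined;
  if (aMissing !== bMissing) return aMissing ? -1 : 1; // missing values first
  if (!aMissing && a.order !== b.order) return a.order - b.order;
  return a.no - b.no; // tie-breaker for duplicates and for two missing values
}

// Returns new documents whose `order` is a gap-free sequence starting at 1.
function renumber(docs) {
  return [...docs]
    .sort(compareDocs)
    .map((doc, index) => ({ ...doc, order: index + 1 }));
}

const docs = [
  { order: null, no: 1 },
  { order: 1, no: 2 },
  { order: 1, no: 3 },
  { order: 3, no: 4 },
];

console.log(renumber(docs));
```

Whatever tie-breaker is chosen, it should be a field that is unique per document; otherwise pages can overlap when the sort is combined with skip/limit pagination.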

Related

get result value as 0 if no data found using nodejs and mongodb

I have a collection of students with different marks. I want to count how many students got each mark (100, 90, 80, 70, 60, etc.), and if no student has a specific mark, I want the count for that mark to be 0 in the output.
Eg:
,
{ $group: { "_id": studentMarks, "studentMarksTotal": { $sum: 1 } } },
{ $sort: { _id: 1 } },
{
$project: {
_id: 0,
studentMarksTotal: 1,
studentMarks:{
$cond: { if: { $eq: ["$_id", "$studentMarks"] }, then: "$_id", else: "0" },
,
},
}}
]
this gives the result for values that have count
Eg: studentMarks:70,
studentMarksTotal:9
studentMarks:90,
studentMarksTotal:6
but I also want count data if no student got certain marks
eg:studentMarks:70,
studentMarksTotal:9
studentMarks:90,
studentMarksTotal:6
studentMarks:100,
studentMarksTotal:0
How can I achieve this? I have tried $ifNull and if/else as well, but I am unable to get the desired result.
If anyone could help me with this, Thanks in advance.

MongoDB Sort documents by the most recents of three fields

I would like to sort a collection by the most recent of three date fields.
Imagine each one of my documents has a field dates with 3 dates inside.
{
...,
dates: {
dateOne: "2010-01-01",
dateTwo: "2020-01-01",
dateThree: "2015-01-01",
},
...
}
I want to fetch my documents sorted by the most recent date of dates, dateTwo in this case, but the most recent could be dateOne in another document.
Note that a document may have all three fields, only two of them, or just one.

The $max operator can take a list of fields to compare. It will return the largest one. It's okay with fields that don't exist, so it doesn't matter if a document is missing dateTwo, for example.
db.collection.aggregate([
{
$project: {
_id: 1,
key: 1,
dates: 1,
latestDate: {
$max: [
"$dates.dateOne",
"$dates.dateTwo",
"$dates.dateThree"
]
}
}
},
{
$sort: {
latestDate: 1
}
}
])
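To see what the $project and $sort stages compute, here is a rough plain-JavaScript equivalent. It assumes the dates are ISO-8601 strings as in the question, so that string comparison matches chronological order. latestDate and sortByLatest are hypothetical names, and missing fields are skipped in the same way $max ignores them.

```javascript
// Hypothetical emulation of
// { $max: ["$dates.dateOne", "$dates.dateTwo", "$dates.dateThree"] }.
function latestDate(doc) {
  const dates = doc.dates || {};
  const present = [dates.dateOne, dates.dateTwo, dates.dateThree].filter(
    (value) => value !== null && value !== undefined
  );
  if (present.length === 0) return null;
  return present.reduce((max, value) => (value > max ? value : max));
}

// Ascending sort on the computed value, like { $sort: { latestDate: 1 } }.
function sortByLatest(docs) {
  return docs
    .map((doc) => ({ ...doc, latestDate: latestDate(doc) }))
    .sort((a, b) =>
      a.latestDate < b.latestDate ? -1 : a.latestDate > b.latestDate ? 1 : 0
    );
}

const sample = [
  { key: "a", dates: { dateOne: "2010-01-01", dateTwo: "2020-01-01", dateThree: "2015-01-01" } },
  { key: "b", dates: { dateOne: "2018-06-01" } },
  { key: "c", dates: { dateTwo: "2012-03-01", dateThree: "2011-01-01" } },
];

console.log(sortByLatest(sample).map((doc) => [doc.key, doc.latestDate]));
```

If the most recent documents should come first, sort in descending order instead (latestDate: -1 in the pipeline).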

MongoDB Update (Aggregation Pipeline) using values from nested nested Arrays

I'm trying to update a field (totalPrice) in my document(s) based on a value in a nested, nested array (addOns > get matching array from x number of arrays > get int in matching array[always at pos 2]).
Here's an example of a document with 2 arrays nested in addOns:
{
"_id": ObjectID('.....'),
"addOns": [
["appleId", "Apples", 2],
["bananaID", "Bananas", 1]
],
"totalPrice": 5.7
}
Let's say the price of bananas increased by 20cents, so I need to look for all documents that have bananas in its addOns, and then increase the totalPrice of these documents depending on the number of bananas bought. I plan on using a updateMany() query using an aggregation pipeline that should roughly look like the example below. The ??? should be the number of bananas bought, but I'm not sure how to go about retrieving that value. So far I've thought of using $unwind, $filter, and $arrayElemAt, but not sure how to use them together. Would appreciate any help!
db.collection('orders').updateMany(
{ $elemMatch: { $elemMatch: { $in: ['bananaID'] } } },
[
{ $set: {'totalPrice': { $add: ['$totalPrice', { $multiply: [{$toDecimal: '0.2'}, ???] } ] } } }
]
I'm not exactly sure what my mongo version is, but I do know that I can use the aggregation pipeline because I have other updateMany() calls that also use the pipeline without any issues, so it should be (version >= 4.2).
**Edit: Thought of this for the ???, the filter step seems to work somewhat as the documents are getting updated, but the value is null instead of the updated price. Not sure why
1. Filter the '$addOns' array to return the matching nested array.
2. For the condition, use $eq to check if the element at [0] ('$$this.0') matches the string 'bananaID', and if it does return this nested array.
3. Get the array returned from step 1 using $arrayElemAt and position [0].
4. Use $arrayElemAt again on the array from step 3 with position [2], as it is the index of the quantity element.
{ $arrayElemAt: [{ $arrayElemAt: [ { $filter: { input: "$addOns", as:'this', cond: { $eq: ['$$this.0', 'bananaID'] } } } , 0 ] }, 2 ] }

Managed to solve it myself, though only use this if you don't mind the risk of updateMany() as stated by Joe in the question's comments. It's very similar to what I originally shared in **Edit, except you can't use $$this.0 to access elements in the array.
Insert this into where the ??? is and it'll work, below this code block is the explanation:
{ $arrayElemAt: [{ $arrayElemAt: [ { $filter: { input: "$addOns", as:'this', cond: { $eq: [{$arrayElemAt:['$$this', 0]}, 'bananaID'] } } } , 0 ] }, 2 ] }
1. Our array looks like this: [["appleId", "Apples", 2], ["bananaID", "Bananas", 1]]
2. Use $filter to return a subset of our array that only contains the elements that match our condition. In this case we want the array for bananas. $$this represents each element in input, and we check whether the first element of $$this matches 'bananaID'.
{ $filter: { input: "$addOns", as:'this', cond: { $eq: [{$arrayElemAt:['$$this', 0]}, 'bananaID'] } } }
// what the result from $filter should look like
// [["bananaID", "Bananas", 1]]
3. Because the nested banana array will always be the first element of that result, we use $arrayElemAt on the result from step 2 to retrieve the element at position 0.
// What it looks like after step 3: ["bananaID", "Bananas", 1]
4. The last step is to use $arrayElemAt again, except this time we want to retrieve the element at position 2 (quantity of bananas bought).
This is what the final updateMany() query looks like after steps 1-4 are evaluated.
db.collection('orders').updateMany(
  // the filter has to name the array field it is matching against
  { addOns: { $elemMatch: { $elemMatch: { $in: ['bananaID'] } } } },
  [
    { $set: {'totalPrice': { $add: ['$totalPrice', { $multiply: [{$toDecimal: '0.2'}, 1] } ] } } }
  ], *callback function code*)
// notice how the ??? is now replaced by the quantity of bananas which is 1
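The reason '$$this.0' returns null is that, inside an aggregation expression, dot notation looks up a field name rather than an array index, so the index has to be read with $arrayElemAt. As a way to check the logic without a database, the nested expression can be mirrored in plain JavaScript. This is only an emulation with a hypothetical helper name (quantityOf), not something MongoDB provides.

```javascript
// Hypothetical emulation of the nested $arrayElemAt / $filter expression.
function quantityOf(addOns, addOnId) {
  // $filter: keep only the nested arrays whose element 0 equals addOnId
  const matches = addOns.filter((entry) => entry[0] === addOnId);
  // $arrayElemAt(..., 0): take the first match (undefined if there is none)
  const firstMatch = matches[0];
  // $arrayElemAt(..., 2): the quantity sits at index 2
  return firstMatch === undefined ? undefined : firstMatch[2];
}

const order = {
  addOns: [["appleId", "Apples", 2], ["bananaID", "Bananas", 1]],
  totalPrice: 5.7,
};

const bananas = quantityOf(order.addOns, "bananaID");
const newTotal = order.totalPrice + 0.2 * bananas;

console.log(bananas, newTotal.toFixed(2));
```

Because the update filter only selects documents that contain 'bananaID', the "no match" branch should not be reached in the real query; it is kept here so the helper is safe to call on any array.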

Conditional Projection if element exists in Array in mongodb

Is there a direct way to project a new field if a value matches one in a huge sub array? I know I can use the $elemMatch or $ in the $match condition, but that would not allow me to get the rest of the non-matching values (users in my case).
Basically I want to list all type 1 items and show all the users while highlighting the subscribed user. The reason I want to do this in MongoDB is to avoid iterating over multiple thousand users for every item. In fact, that is part 2 of my question: can I limit the size of the users array that is returned? I just need around 10 array values, not thousands.
The collection structure is
{
name: "Coke",
type: 2,
users:[{user: 13, type:1},{user: 2, type:2}]
},
{
name: "Adidas",
type: 1,
users:[{user:31, type:3},{user: 51, type:1}]
},
{
name: "Nike",
type: 1,
users:[{user:21, type:3},{user: 31, type:1}]
}
Total documents are 200,000+ and growing...
Every document has 10,000~50,000 users..
expected return
{
isUser: true,
name: "Adidas",
type: 1,
users:[{user:31, type:3},{user: 51, type:1}]
},
{
isUser: false,
name: "Nike",
type: 1,
users:[{user:21, type:3},{user: 31, type:1}]
}
and i've been trying this
.aggregate([
{$match:{type:1}},
{$project:
{
isUser:{$elemMatch:["users.user",51]},
users: 1,
type:1,
name: 1
}
}
])
This fails with the error "Maximum Stack size exceeded". I've tried a lot of combinations and none seem to work. I really want to avoid running multiple calls to MongoDB. Can this be done in a single call?
I've been told to use unwind, but I am a bit worried that it might lead to memory issues.
If I were using MySQL, a simple subquery would have done the job... I hope I am overlooking a similarly simple solution in MongoDB.

Evaluate the condition against each array element and combine the results with $anyElementTrue, which takes an array and returns true if any of its elements is true and false otherwise. The $map expression applies the condition to each item in the users array and returns an array of the results. The $ifNull operator acts as a safety net: it returns the $map result if that is non-null, and the fallback [false] otherwise (for example, when a document has no users field). $anyElementTrue then evaluates the resulting array to produce the isUser field for each document:
db.collection.aggregate([
{ "$match": { "type": 1} },
{
"$project": {
"name": 1, "type": 1,
"isUser": {
"$anyElementTrue": [
{
'$ifNull': [
{
"$map": {
"input": "$users",
"as": "el",
"in": { "$eq": [ "$$el.user",51] }
}
},
[false]
]
}
]
}
}
}
])
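The same logic can be mirrored in plain JavaScript to make the roles of the three operators explicit. This is an emulation for illustration only; isUser and projected are hypothetical names. It also returns the first 10 users per item, which in the pipeline would correspond to adding a users field to the $project stage, for instance with the $slice array expression (assuming a MongoDB version that supports it, 3.2 or later).

```javascript
// Hypothetical emulation of the isUser expression from the pipeline above.
function isUser(doc, userId) {
  // $ifNull: fall back to [false] when there is no users array to map over
  const flags = Array.isArray(doc.users)
    ? doc.users.map((el) => el.user === userId) // $map
    : [false];
  return flags.some(Boolean); // $anyElementTrue
}

const items = [
  { name: "Coke", type: 2, users: [{ user: 13, type: 1 }, { user: 2, type: 2 }] },
  { name: "Adidas", type: 1, users: [{ user: 31, type: 3 }, { user: 51, type: 1 }] },
  { name: "Nike", type: 1, users: [{ user: 21, type: 3 }, { user: 31, type: 1 }] },
];

const projected = items
  .filter((doc) => doc.type === 1) // $match
  .map((doc) => ({
    name: doc.name,
    type: doc.type,
    isUser: isUser(doc, 51),
    users: doc.users.slice(0, 10), // comparable to { $slice: ["$users", 10] }
  }));

console.log(projected);
```

Note that the flag is computed over the whole users array before it is truncated, so the subscribed user is detected even when that user is not among the 10 entries returned.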

Mongoose aggregation "$sum" of rows in sub document

I'm fairly good with sql queries, but I can't seem to get my head around grouping and getting sum of mongo db documents,
With this in mind, I have a job model with schema like below :
{
name: {
type: String,
required: true
},
info: String,
active: {
type: Boolean,
default: true
},
all_service: [{
price: {
type: Number,
min: 0,
required: true
},
all_sub_item: [{
name: String,
price:{ // << -- this is the price I want to calculate
type: Number,
min: 0
},
owner: {
user_id: { // <<-- here is the filter I want to put
type: Schema.Types.ObjectId,
required: true
},
name: String,
...
}
}]
}],
date_create: {
type: Date,
default : Date.now
},
date_update: {
type: Date,
default : Date.now
}
}
I would like to have the sum of the price column where the owner is present. I tried the code below, but with no luck:
Job.aggregate(
[
{
$group: {
_id: {}, // not sure what to put here
amount: { $sum: '$all_service.all_sub_item.price' }
},
$match: {'not sure how to limit the user': given_user_id}
}
],
//{ $project: { _id: 1, expense: 1 }}, // you can only project fields from 'group'
function(err, summary) {
console.log(err);
console.log(summary);
}
);
Could someone guide me in the right direction. thank you in advance

Primer
As is correctly noted earlier, it does help to think of an aggregation "pipeline" just as the "pipe" | operator from Unix and other system shells. One "stage" feeds input to the "next" stage and so on.
The thing you need to be careful with here is that you have "nested" arrays, one array within another, and this can make drastic differences to your expected results if you are not careful.
Your documents consist of an "all_service" array at the top level. Presumably there are often "multiple" entries here, all containing your "price" property as well as "all_sub_item". Then of course "all_sub_item" is an array in itself, also containing many items of its own.
You can think of these arrays as the "relations" between your tables in SQL, in each case a "one-to-many". But the data is in a "pre-joined" form, where you can fetch all data at once without performing joins. That much you should already be familiar with.
However, when you want to "aggregate" across documents, you need to "de-normalize" this in much the same way as in SQL by "defining" the "joins". This is to "transform" the data into a de-normalized state that is suitable for aggregation.
So the same visualization applies. A master document's entries are replicated by the number of child documents, and a "join" to an "inner-child" will replicate both the master and initial "child" accordingly. In a "nutshell", this:
{
"a": 1,
"b": [
{
"c": 1,
"d": [
{ "e": 1 }, { "e": 2 }
]
},
{
"c": 2,
"d": [
{ "e": 1 }, { "e": 2 }
]
}
]
}
Becomes this:
{ "a" : 1, "b" : { "c" : 1, "d" : { "e" : 1 } } }
{ "a" : 1, "b" : { "c" : 1, "d" : { "e" : 2 } } }
{ "a" : 1, "b" : { "c" : 2, "d" : { "e" : 1 } } }
{ "a" : 1, "b" : { "c" : 2, "d" : { "e" : 2 } } }
And the operation to do this is $unwind, and since there are multiple arrays then you need to $unwind both of them before continuing any processing:
db.collection.aggregate([
{ "$unwind": "$b" },
{ "$unwind": "$b.d" }
])
So the "pipe" first unwinds the array from "$b", like so:
{ "a" : 1, "b" : { "c" : 1, "d" : [ { "e" : 1 }, { "e" : 2 } ] } }
{ "a" : 1, "b" : { "c" : 2, "d" : [ { "e" : 1 }, { "e" : 2 } ] } }
Which leaves a second array referenced by "$b.d" to be further de-normalized into the final de-normalized result "without any arrays". This allows other operations to process.
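The replication effect of the two $unwind stages can be reproduced in plain JavaScript. The sketch below is an emulation under the assumption that every array field is present; unwind is a hypothetical helper written for this example, not a driver function.

```javascript
// Hypothetical emulation of a single $unwind on a top-level array field:
// one output document per array element, with the array replaced by the element.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    (doc[field] || []).map((element) => ({ ...doc, [field]: element }))
  );
}

const input = [
  {
    a: 1,
    b: [
      { c: 1, d: [{ e: 1 }, { e: 2 }] },
      { c: 2, d: [{ e: 1 }, { e: 2 }] },
    ],
  },
];

// First { $unwind: "$b" }, then { $unwind: "$b.d" } on the nested array.
const afterB = unwind(input, "b");
const afterD = afterB.flatMap((doc) =>
  unwind([doc.b], "d").map((b) => ({ ...doc, b }))
);

afterD.forEach((doc) => console.log(JSON.stringify(doc)));
```

One input document with 2 x 2 nested elements becomes four documents, which is why filtering before unwinding matters on large arrays.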
Solving
With just about "every" aggregation pipeline, the "first" thing you want to do is "filter" the documents to only those that contain your results. This is a good idea, especially when doing operations such as $unwind, since you don't want to be doing that on documents that do not even match your target data.
So you need to match your "user_id" at the array depth. But this is only part of getting the result, since you should be aware of what happens when you query a document for a matching value in an array.
Of course, the "whole" document is still returned, because this is what you really asked for. The data is already "joined" and we haven't asked to "un-join" it in any way. You can look at this as a "first" document selection, but once "de-normalized", every array element now actually represents a "document" in itself.
So not "only" do you $match at the beginning of the "pipeline", you also $match after you have processed "all" $unwind statements, down to the level of the element you wish to match.
Job.aggregate(
[
// Match to filter possible "documents"
{ "$match": {
"all_service.all_sub_item.owner": given_user_id
}},
// De-normalize arrays
{ "$unwind": "$all_service" },
{ "$unwind": "$all_service.all_sub_item" },
// Match again to filter the array elements
{ "$match": {
"all_service.all_sub_item.owner": given_user_id
}},
// Group on the "_id" for the "key" you want, or "null" for all
{ "$group": {
"_id": null,
"total": { "$sum": "$all_service.all_sub_item.price" }
}}
],
function(err,results) {
}
)
Alternately, modern MongoDB releases since 2.6 also support the $redact operator. This could be used in this case to "pre-filter" the array content before processing with $unwind:
Job.aggregate(
[
// Match to filter possible "documents"
{ "$match": {
"all_service.all_sub_item.owner": given_user_id
}},
// Filter arrays for matches in document
{ "$redact": {
"$cond": {
"if": {
"$eq": [
{ "$ifNull": [ "$owner", given_user_id ] },
given_user_id
]
},
"then": "$$DESCEND",
"else": "$$PRUNE"
}
}},
// De-normalize arrays
{ "$unwind": "$all_service" },
{ "$unwind": "$all_service.all_sub_item" },
// Group on the "_id" for the "key" you want, or "null" for all
{ "$group": {
"_id": null,
"total": { "$sum": "$all_service.all_sub_item.price" }
}}
],
function(err,results) {
}
)
That can "recursively" traverse the document and test for the condition, effectively removing any "un-matched" array elements before you even $unwind. This can speed things up a bit since items that do not match would not need to be "un-wound". However there is a "catch" in that if for some reason the "owner" did not exist on an array element at all, then the logic required here would count that as another "match". You can always $match again to be sure, but there is still a more efficient way to do this:
Job.aggregate(
[
// Match to filter possible "documents"
{ "$match": {
"all_service.all_sub_item.owner": given_user_id
}},
// Filter arrays for matches in document
{ "$project": {
"all_items": {
"$setDifference": [
{ "$map": {
"input": "$all_service",
"as": "A",
"in": {
"$setDifference": [
{ "$map": {
"input": "$$A.all_sub_item",
"as": "B",
"in": {
"$cond": {
"if": { "$eq": [ "$$B.owner", given_user_id ] },
"then": "$$B",
"else": false
}
}
}},
[false]
]
}
}},
[[]]
]
}
}},
// De-normalize the "two" level array. "Double" $unwind
{ "$unwind": "$all_items" },
{ "$unwind": "$all_items" },
// Group on the "_id" for the "key" you want, or "null" for all
{ "$group": {
"_id": null,
"total": { "$sum": "$all_items.price" }
}}
],
function(err,results) {
}
)
That process cuts down the size of the items in both arrays "drastically" compared to $redact. The $map operator processes each element of an array with the statement given in "in". In this case, each "outer" array element is sent to another $map to process the "inner" elements.
A logical test is performed here with $cond, whereby if the "condition" is met then the "inner" array element is returned; otherwise the false value is returned.
The $setDifference is used to filter out any false values that are returned, or, as in the "outer" case, any "blank" arrays resulting from all false values being filtered from the "inner" array where there is no match. This leaves just the matching items, encased in a "double" array, e.g:
[[{ "_id": 1, "price": 1, "owner": "b" },{..}],[{..},{..}]]
As "all" array elements have an _id by default with mongoose (and this is a good reason why you keep that) then every item is "distinct" and not affected by the "set" operator, apart from removing the un-matched values.
Process $unwind "twice" to convert these into plain objects in their own documents, suitable for aggregation.
So those are the things you need to know. As I stated earlier, be "aware" of how the data "de-normalizes" and what that implies towards your end totals.
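To check a pipeline like the ones above against expectations, it can help to compute the same total by hand on a small sample. The following plain-JavaScript sketch does that. It assumes, as in the example output above, that "owner" compares directly against the given id, and totalForOwner is a hypothetical name used only here.

```javascript
// Hypothetical emulation: sum the price of every sub-item owned by `userId`.
function totalForOwner(jobs, userId) {
  return jobs
    .flatMap((job) => job.all_service || []) // first $unwind
    .flatMap((service) => service.all_sub_item || []) // second $unwind
    .filter((item) => item.owner === userId) // $match on the array elements
    .reduce((total, item) => total + (item.price || 0), 0); // $group with $sum
}

const jobs = [
  {
    name: "job 1",
    all_service: [
      { price: 10, all_sub_item: [{ price: 1, owner: "b" }, { price: 2, owner: "c" }] },
      { price: 20, all_sub_item: [{ price: 3, owner: "b" }] },
    ],
  },
  { name: "job 2", all_service: [{ price: 5, all_sub_item: [{ price: 4, owner: "c" }] }] },
];

console.log(totalForOwner(jobs, "b"));
```

Only the inner "price" values are summed; the "price" on each all_service entry is deliberately left out, matching the field that the $group stage references.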

It sounds like you want to, in SQL equivalent, do "sum (prices) WHERE owner IS NOT NULL".
On that assumption, you'll want to do your $match first, to reduce the input set to your sum. So your first stage should be something like
{ $match: { "all_service.all_sub_item.owner": { $exists: true } } }
Think of this as then passing all matching documents to your second stage.
Now, because you are summing an array, you have to do another step. Aggregation operators work on documents - there isn't really a way to sum an array. So we want to expand your array so that each element in the array gets pulled out to represent the array field as a value, in its own document. Think of this as a cross join. This will be $unwind.
{ $unwind: "$all_service" },
{ $unwind: "$all_service.all_sub_item" }
Now you've just made a much larger number of documents, but in a form where we can sum them. Now we can perform the $group. In your $group, you specify a transformation. The line:
_id: {}, // not sure what to put here
is creating a field in the output document, which is not the same documents as the input documents. So you can make the _id here anything you'd like, but think of this as the equivalent to your "GROUP BY" in sql. The $sum operator will essentially be creating a sum for each group of documents you create here that match that _id - so essentially we'll be "re-collapsing" what you just did with $unwind, by using the $group. But this will allow $sum to work.
I think you're looking for grouping on just your main document id, so I think your $sum statement in your question is correct.
{ $group: { _id: "$_id", totalAmount: { $sum: "$all_service.all_sub_item.price" } } }
This will output documents with an _id field equivalent to your original document ID, and your sum.
I'll let you put it together, I'm not super familiar with node. You were close but I think moving your $match to the front and using an $unwind stage will get you where you need to be. Good luck!
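Put together, the stages described in this answer might look like the sketch below. It is an untested assembly under some assumptions: the field names follow the schema in the question (all_service, all_sub_item), both nested arrays are unwound, and a second $match is added after the unwinds so that sub-items without an owner are not summed. buildPipeline is a hypothetical helper; the array it returns would be passed to Job.aggregate(...).

```javascript
// Hypothetical pipeline builder; only assembles the stage objects.
function buildPipeline() {
  const hasOwner = { "all_service.all_sub_item.owner": { $exists: true } };
  return [
    // Reduce the input to documents that contain at least one owned sub-item.
    { $match: hasOwner },
    // Expand both nested arrays so each sub-item gets its own document.
    { $unwind: "$all_service" },
    { $unwind: "$all_service.all_sub_item" },
    // Re-check after unwinding so sub-items without an owner are dropped.
    { $match: hasOwner },
    // Collapse back to one result per original document, summing the prices.
    {
      $group: {
        _id: "$_id",
        totalAmount: { $sum: "$all_service.all_sub_item.price" },
      },
    },
  ];
}

console.log(JSON.stringify(buildPipeline(), null, 2));
```

Grouping on "$_id" yields one total per job; using _id: null instead would produce a single total across all jobs.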
