I am using mongoose and I want to first find documents based on some filter. Then, I again want to again operate on these documents using mongoose methods. I have tried it using the code similar to one shown below. But, this approach doesn't seem to work.
const docs = await myModel.find(filter1);
const x = docs.countDocuments();
const docs2 = await docs.find(filter2);
Of course, I could chain the methods but chaining would not work in this particular case since I want to calculate count of documents as an intermediate step which I won't be able to get if I chain the two find() methods.
Does there exist some way to apply mongoose operations on the returned documents or will I have to make these things work only with js array methods?
I am trying to query an old mongoDB collection in node.js with the native driver.
It uses Legacy UUID as its _id field.
In my code I am using Binary values such as: "2u0kLwUZuEWQvjqOjgQU4g=="
and I want to query the collection with find() operation.
When trying to test it directly with mongoDB I am able to use the following query:
db.getCollection('myCollection').find({_id: BinData(3, "2u0kLwUZuEWQvjqOjgQU4g==")})
to find my item.
But what I need to do is to find multiple items.
So I need to use something like this:
db.getCollection('myCollection').find({_ids: { $in: [BinData(3, "2u0kLwUZuEWQvjqOjgQU4g=="), BinData(3, "3u0kLwUZuEWQvjqOjgQU4g==")...]}})
but it does not seem to work and always returns zero records.
I am not sure why? and what might be the correct way to query multiple Legacy UUIDs?
I am fairly new to MongoDB and Mongoose, I am really confused about why some of the middleware works at the document and some works on query. I am also confused about why some of the query methods return documents and some return queries. If a query is returning document it is acceptable, but why a query return query and what really it is.
Adding more to my question what is a Document function and Model or Query function, because both of them have some common methods like updateOne.
Moreover, I have gathered all these doubts from the mongoose documentation.
Tl;dr: the type of middleware most commonly defines what the this variable in a pre/post hook refers to:
Middleware Hook
'this' refers to the
methods
Document
Document
validate, save, remove, updateOne, deleteOne, init
Query
Query
count, countDocuments, deleteMany, deleteOne, estimatedDocumentCount, find, findOne, findOneAndDelete, findOneAndRemove, findOneAndReplace, findOneAndUpdate, remove, replaceOne, update, updateOne, updateMany
Aggregation
Aggregation object
aggregate
Model
Model
insertMany
Long explanation:
Middlewares are nothing, but built-in methods to interact with the database in different ways. However, as there are different ways to interact with the database, each with different advantages or preferred use-cases, they also behave differently to each other and therefor their middlewares can behave differently, even if they have the same name.
By themselves, middlewares are just shorthands/wrappers for the mongodbs native driver that's being used under the hood of mongoose. Therefor, you can usually use all middlewares, as if you were using regular methods of objects without having to care if it's a Model-, Query-, Aggregation- or Document-Middleware, as long as it does what you want it to.
However, there are a couple of use-cases where it is important to differentiate the context in which these methods are being called.
The most prominent use-case being hooks. Namely the *.pre() and the *.post() hooks. These hooks are methods that you can "inject" into your mongoose setup, so that they are being executed before or after specific events.
For example:
Let's assume I have the following Schema:
const productSchema = new Schema({
name: 'String',
version: {
type: 'Number',
default: 0
}
});
Now, let's say you always want to increase the version field with every save, so that it automatically increases the version field by 1.
The easiest way to do this would be to define a hook that takes care of this for us, so we don't have to care about this when saving an object. If we for example use .save() on the document we just created or fetched from the database, we'd just have to add the following pre-hook to the schema like this:
productSchema.pre('save', function(next) {
this.version = this.version + 1; // or this.version += 1;
next();
});
Now, whenever we call .save() on a document of this schema/model, it will always increment the version before it is actually being saved, even if we only changed the name.
However, what if we don't use the .save() or any other document-only middleware but e.g. a query middleware like findOneAndUpdate() to update an object?
Then, we won't be able to use the pre('save') hook, as .save() won't be called. In this case, we'd have to implement a similar hook for findOneAndUpdate().
Here, however, we finally come to the differences in the middlewares, as the findOneAndUpdate() hook won't allow us to do that, as it is query hook, meaning it does not have access to the actual document, but only to the query itself. So if we e.g. only change the name of the product the following middleware would not work as expected:
productSchema.pre('findOneAndUpdate', function(next) {
// this.version is undefined in the query and would therefor be NaN
this.version = this.version + 1;
next();
});
The reason for this is, that the object is directly updated in the database and not first "downloaded" to nodejs, edited and "uploaded" again. This means, that in this hook this refers to the query and not the document, meaning, we don't know what the current state of version is.
If we were to increment the version in a query like this, we'd need to update the hook as follows, so that it automatically adds the $inc operator:
productSchema.pre('findOneAndUpdate', function(next) {
this.$inc = { version: 1 };
next();
});
Alternatively, we could emulate the previous logic by manually fetching the target document and editing it using an async function. This would be less efficient in this case, as it would always call the db twice for every update, but would keep the logic consistent:
productSchema.pre('findOneAndUpdate', async function() {
const productToUpdate = await this.model.findOne(this.getQuery());
this.version = productToUpdate.version + 1;
next();
});
For a more detailed explanation, please the check the official documentation that also has a designated paragraph for the problem of having colliding naming of methods (e.g. remove() being both a Document and Query middleware method)
Saving an aggregation query using "mongodb": "^3.0.6" as result with the $out operator is only working when calling .toArray().
The aggregation step(s):
let aggregationSteps = [{
$group: {
_id: '$created_at',
}
}, {'$out': 'ProjectsByCreated'}];
Executing the aggregation:
await collection.aggregate(aggregationSteps, {'allowDiskUse': true})
Expected result: New collection called ProjectsByCreated.
Result: No collection, query does not throw an exception but is not being executed? (takes only 1ms)
Appending toArray() results in the expected behaviour:
await collection.aggregate(aggregationSteps, {'allowDiskUse': true}).toArray();
Why does mongodb only create the result collection when calling .toArray() and where does the documentation tell so? How can I fix this?
The documentation doesn't seem to provide any information about this:
https://mongodb.github.io/node-mongodb-native/3.0/api/Collection.html#aggregate
https://docs.mongodb.com/manual/reference/operator/aggregation/out/index.html
MongoDB acknowledge this behaviour, but they also say this is working as designed.
It has been logged as a bug in the MongoDB JIRA, $out aggregation stage doesn't take effect, and the responses say it is not a fault:
This behavior is intentional and has not changed in some time with the node driver. When you "run" an aggregation by calling Collection.prototype.aggregate, we create an intermediary Cursor which is not executed until some sort of I/O is requested. This allows us to provide the chainable cursor API (e.g. cursor.limit(..).sort(..).project(..)), building up the find or aggregate options in a builder before executing the initial query.
... Chaining toArray in order to execute the out stage doesn't feel quite right. Is there something more natural that I haven't noticed?
Unfortunately not, the chained out method there simply continues to build your aggregation. Any of the following methods will cause the initial aggregation to be run: toArray, each, forEach, hasNext, next. We had considered adding something like exec/run for something like change streams, however it's still in our backlog. For now you could theoretically just call hasNext which should run the first aggregation and retrieve the first batch (this is likely what exec/run would do internally anyway).
So, it looks like you do have to call one of the methods to start iterating the cursor before $out will do anything. Adding .toArray(), as you're already doing, is probably safest. Note that to.Array() does not load the entire result into RAM as normal; because it includes a $out, the aggregation returns an empty cursor.
Because the Aggregation operation returns a cursor, not the results.
In order to return all the documents from the cursor, we need to use toArray method.
I'm using an ajax request from the front end to load more comments to a post from the back-end which uses NodeJS and mongoose. I won't bore you with the front-end code and the route code, but here's the query code:
Post.findById(req.params.postId).populate({
path: type, //type will either contain "comments" or "answers"
populate: {
path: 'author',
model: 'User'
},
options: {
sort: sortBy, //sortyBy contains either "-date" or "-votes"
skip: parseInt(req.params.numberLoaded), //how many are already shown
limit: 25 //i only load this many new comments at a time.
}
}).exec(function(err, foundPost){
console.log("query executed"); //code takes too long to get to this line
if (err){
res.send("database error, please try again later");
} else {
res.send(foundPost[type]);
}
});
As was mentioned in the title, everything works fine, my problem is just that this is too slow, the request is taking about 1.5-2.5 seconds. surely mongoose has a method of doing this that takes less to load. I poked around the mongoose docs and stackoverflow, but didn't really find anything useful.
Using skip-and-limit approach with mongodb is slow in its nature because it normally needs to retrieve all documents, then sort them, and after that return the desired segment of the results.
What you need to do to make it faster is to define indexes on your collections.
According to MongoDB's official documents:
Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must perform a collection scan, i.e. scan every document in a collection, to select those documents that match the query statement. If an appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must inspect.
-- https://docs.mongodb.com/manual/indexes/
Using indexes may cause increased collection size but they improve the efficiency a lot.
Indexes are commonly defined on fields which are frequently used in queries. In this case, you may want to define indexes on date and/or vote fields.
Read mongoose documentation to find out how to define indexes in your schemas:
http://mongoosejs.com/docs/guide.html#indexes