How to fetch/count millions of records in MongoDB with Node.js

We have a collection with millions of records in MongoDB, and counting them to build pagination takes so long that the request times out. What's the best way to do this with Node.js? I want to create a page that supports pagination, counting, deleting, and searching of records. Below is the code that queries Mongo with different conditions.
crowdResult.find({ "auditId": args.audit_id, "isDeleted": false })
  .skip(args.skip)   // offset into the result set
  .limit(args.limit) // page size
  .exec(function (err, data) {
    if (err)
      return callback(err, null);
    console.log(data);
    return callback(null, data);
  });

If the goal is to get through a large dataset without timing out, then I use the following approach to fetch pages one after another and process each paged result set as soon as it becomes available:
https://gist.github.com/pulkitsinghal/2f3806670439fa137210fc26b134237f
Please focus on the following lines to get a quick idea of what the code is doing before diving deeper:
Let getPage() handle the work; you can set the pageSize and query to your liking:
https://gist.github.com/pulkitsinghal/2f3806670439fa137210fc26b134237f#file-sample-js-L68
Method signature:
https://gist.github.com/pulkitsinghal/2f3806670439fa137210fc26b134237f#file-sample-js-L29
Process pagedResults as soon as they become available:
https://gist.github.com/pulkitsinghal/2f3806670439fa137210fc26b134237f#file-sample-js-L49
Move on to the next page:
https://gist.github.com/pulkitsinghal/2f3806670439fa137210fc26b134237f#file-sample-js-L53
The code will stop when there is no more data left:
https://gist.github.com/pulkitsinghal/2f3806670439fa137210fc26b134237f#file-sample-js-L41
Or it will stop when working on the last page of data:
https://gist.github.com/pulkitsinghal/2f3806670439fa137210fc26b134237f#file-sample-js-L46
I hope this offers some inspiration, even if it's not an exact solution for your needs.
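If it helps, here is a minimal self-contained sketch of that same page-after-page pattern, using the model and filter from the question. The getPage name mirrors the gist, but the body below is illustrative rather than the gist's actual code:

function getPage(model, filter, pageNumber, pageSize, processPage, done) {
  model.find(filter)
    .skip(pageNumber * pageSize)
    .limit(pageSize)
    .exec(function (err, docs) {
      if (err) return done(err);
      if (docs.length === 0) return done(null);     // no more data left: stop
      processPage(docs);                            // process this page right away
      if (docs.length < pageSize) return done(null); // this was the last page: stop
      getPage(model, filter, pageNumber + 1, pageSize, processPage, done); // next page
    });
}

// Usage with the question's query; pageSize is tunable.
getPage(crowdResult,
        { auditId: args.audit_id, isDeleted: false },
        0, 1000,
        function (docs) { console.log('processing %d docs', docs.length); },
        function (err) { if (err) return callback(err, null); callback(null, 'done'); });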

Related

Query all Tables in Sails.js

I have a project that requires syncing, wherein syncing means gathering all the data from all tables at start-up. Kinda easy.
However, with Node.js and the Sails.js framework, I can't seem to find a way to do so, since one model equals one table, each laid out as its own file in projectName/api/models/.
My initial idea was to loop over everything in that directory so I could run my query for each model, but that didn't work when I tried it.
Here is my source code for the simple query for only one model:
// Run a raw SQL query against this model's datastore
modelName.getDatastore().sendNativeQuery('SELECT * FROM table WHERE id = 0', function (err, res) {
  if (err) {
    console.log(err);
    return exits.success(err);
  }
  return exits.success(res);
});
With what I have tried (not in my sample above), I changed modelName into a string to test whether looping over the directory works, which it doesn't. I also tried temporarily creating a simple variable holding one model's name and used it for the query, which also didn't work. I'm at my wit's end and can't find a solution even on Google. Any help?
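One possible direction (an assumption on my part, not from the original thread): Sails exposes every loaded model on the sails.models dictionary, keyed by the model's identity, so you can iterate over that instead of reading the models directory yourself. A minimal sketch, assuming Sails 1.x where .find() is awaitable:

// Hypothetical sketch: fetch all records from every loaded Sails model.
// sails.models is keyed by model identity (the lowercased model name).
async function syncAll() {
  const allData = {};
  for (const identity of Object.keys(sails.models)) {
    // .find({}) returns every record for that model's table
    allData[identity] = await sails.models[identity].find({});
  }
  return allData;
}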

Mongoose query using sort and skip on populate is too slow

I'm using an AJAX request from the front end to load more comments on a post from the back end, which uses Node.js and Mongoose. I won't bore you with the front-end code or the route code, but here's the query code:
Post.findById(req.params.postId).populate({
  path: type, // type will contain either "comments" or "answers"
  populate: {
    path: 'author',
    model: 'User'
  },
  options: {
    sort: sortBy, // sortBy contains either "-date" or "-votes"
    skip: parseInt(req.params.numberLoaded), // how many are already shown
    limit: 25 // only load this many new comments at a time
  }
}).exec(function (err, foundPost) {
  console.log("query executed"); // code takes too long to get to this line
  if (err) {
    res.send("database error, please try again later");
  } else {
    res.send(foundPost[type]);
  }
});
As mentioned in the title, everything works fine; my problem is just that this is too slow - the request takes about 1.5-2.5 seconds. Surely Mongoose has a way of doing this that loads faster. I poked around the Mongoose docs and Stack Overflow, but didn't really find anything useful.
Using the skip-and-limit approach with MongoDB is slow by nature, because the server normally needs to retrieve all matching documents, sort them, and only then return the desired segment of the results.
What you need to do to make it faster is to define indexes on your collections.
According to MongoDB's official documentation:
Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must perform a collection scan, i.e. scan every document in a collection, to select those documents that match the query statement. If an appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must inspect.
-- https://docs.mongodb.com/manual/indexes/
Using indexes increases the collection's storage size, but they improve query efficiency a lot.
Indexes are commonly defined on fields that are frequently used in queries. In this case, you may want to define indexes on the date and/or votes fields.
Read mongoose documentation to find out how to define indexes in your schemas:
http://mongoosejs.com/docs/guide.html#indexes
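As a minimal sketch, a schema-level index on the sort fields looks like this (the schema fields here are assumptions based on the question's sort keys, not the asker's real schema):

// Hypothetical comment schema; the real one will have more fields.
var commentSchema = new mongoose.Schema({
  text: String,
  date: Date,
  votes: Number,
  author: { type: mongoose.Schema.Types.ObjectId, ref: 'User' }
});

// Descending indexes matching the "-date" and "-votes" sort options
commentSchema.index({ date: -1 });
commentSchema.index({ votes: -1 });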

Speeding up my cloudant query

I was wondering whether someone could provide some advice on my Cloudant query below. It now takes upwards of 20 seconds to execute against a DB of 50,000 documents - I suspect I could be getting better speed than this.
The purpose of the query is to find all of my documents with the attribute "searchCode" equal to a specific value, plus a further list of specific IDs.
Both searchCode and _id are indexed - any ideas why my query would be taking so long / what I could do to speed it up?
mydb.find({ selector: { "$or": [{ "searchCode": searchCode }, { "_id": { "$in": idList } }] } }, function (err, result) {
  if (!err) {
    fulfill(result.docs);
  } else {
    console.error(err);
  }
});
Thanks,
James
You could try doing separate calls for the two queries:
find me documents where the searchCode = 'some value'
find me documents whose ids match a list of ids
The first can be achieved with a find call and a query like so:
{ selector: {"searchCode": searchCode} }
The second can be achieved by hitting the databases's _all_docs endpoint, passing in the list of ids as a keys parameter e.g.
GET /db/_all_docs?keys=["a","b","c"]
You might find that running both requests in parallel and merging the results gives you better performance.
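A sketch of the two parallel calls, assuming the nano/Cloudant Node.js library, whose db.fetch() helper POSTs the keys to _all_docs with include_docs=true:

// Query 1: selector on searchCode, as in the question.
var bySearchCode = new Promise(function (resolve, reject) {
  mydb.find({ selector: { "searchCode": searchCode } }, function (err, result) {
    if (err) reject(err); else resolve(result.docs);
  });
});

// Query 2: bulk-fetch the documents by _id via _all_docs.
var byId = new Promise(function (resolve, reject) {
  mydb.fetch({ keys: idList }, function (err, result) {
    if (err) reject(err);
    else resolve(result.rows.filter(function (r) { return r.doc; })
                            .map(function (r) { return r.doc; }));
  });
});

Promise.all([bySearchCode, byId]).then(function (results) {
  // Merge and de-duplicate by _id, since a document may match both queries.
  var seen = {};
  var merged = [];
  results[0].concat(results[1]).forEach(function (doc) {
    if (!seen[doc._id]) { seen[doc._id] = true; merged.push(doc); }
  });
  fulfill(merged);
});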

find() operation taking a long time when MongoDB collection contains around 200K docs

My MongoDB database contains a collection with 200k documents. I'm trying to fetch all of the documents in Node.js as follows:
var cursor = collection.find({}, {
  "_id": false // exclude the _id field from the results
}).toArray(function (err, docs) {
  if (err)
    throw err;
  callback(null, docs);
});
The above operation takes a long time and I am not able to get any results. Is there any way to optimize the find operation to get the result?
Node.js driver version: 2.0
MongoDB version: 3.2.2
I can easily load the data from a raw JSON file, but I am not able to do it from MongoDB.
People can't do a lot with 200k items in the UI. Google shows only 10 results per page, for good reason. It sounds like pagination can help you. Here's an example: Range query for MongoDB pagination
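As a minimal sketch of that range-query idea (the function and parameter names here are illustrative, not from the linked answer): instead of skipping, remember the last _id you returned and ask only for documents after it, so the server can seek straight to the start of the page via the _id index:

// Range-based (keyset) pagination: no skip, so each page is cheap.
function getNextPage(collection, lastId, pageSize, callback) {
  var query = lastId ? { _id: { $gt: lastId } } : {};
  collection.find(query)
    .sort({ _id: 1 })
    .limit(pageSize)
    .toArray(callback);
}

// First page:  getNextPage(collection, null, 100, cb);
// Later pages: pass the _id of the last document from the previous page.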

Mongoose: How to slice the entire query?

I'm looking for a way to get M documents out of a particular query, starting at the Nth document, without loading the entire collection in the exec() callback and then splicing an array from there. I'm well aware of .limit(x), which works fine and dandy to get from 0 to x, but to my knowledge there is no way to choose where the query starts limiting the number of documents - something like limit(10) starting from 5.
I tried something like this:
Model.find().sort({ creationDate: -1 }).where("_id").splice([5, 10]).exec(function (err, data) {
  if (err) res.send(502, "ERROR IN DB DATABASE");
  res.send(data);
});
But the resulting data consists of the entire collection.
Any ideas on how to achieve this?
.skip() is what you are looking for:
Model.find(...).sort(...).skip(5).limit(10).exec(....)
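Spelled out with the names from the question (the error-handling shape here just mirrors the question's own code and is illustrative):

// 10 documents starting at the 6th, newest first
Model.find()
  .sort({ creationDate: -1 })
  .skip(5)   // skip the first 5 matches
  .limit(10) // then take the next 10
  .exec(function (err, data) {
    if (err) return res.send(502, "database error");
    res.send(data);
  });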
