Can't fetch all documents from my MongoDB on Azure Cosmos DB - Node.js

I am trying to get ALL documents from a collection in my Cosmos DB on Azure. The collection contains approximately 50,000 documents.
I get this error: "MongoError: cursor does not exist, was killed or timed out" when I run this:
const mongoose = require('mongoose');
const mongooseOptions = { useNewUrlParser: true };
mongoose.connect(connectionString, mongooseOptions);
mongoose.set('useCreateIndex', true);
mongoose.Promise = global.Promise;
const mongoDB = mongoose.connection;
mongoDB.on('error', console.error.bind(console, 'MongoDB connection error:'));
const Schema = mongoose.Schema;
const MongoEidModelSchema = new Schema({
  uid: { type: String, unique: true },
  eid: { type: String, unique: true }
});
const MongoEidModel = mongoose.model('eids', MongoEidModelSchema);
MongoEidModel.find({}, {timeout: false}).then(data => {
  console.log(data);
  console.log(Object.keys(data).length);
});
When I set a limit of 1000 or 1500 on the find() it works.
I have also tried changing the RU/s on the collection from 400 to 10,000 (in the Azure Portal), which also works, but that seems like an expensive solution... doesn't it?
I have also tried fetching the documents with find() in batches in a recursive loop until there are no more documents left, with a sleep between each iteration (otherwise Cosmos DB gives me "429: Too many requests" after a while).
Is there a way to get ALL 50,000 documents using Node.js and Mongoose without changing the RU/s or using recursive loops?
Thanks in advance!
/Daniel

To avoid confusion: I assume you're using the MongoDB driver to access Cosmos DB in Azure?
For MongoDB, there is a query limit of 16 MB (which you may well be shooting past if you are returning 50k documents). See here: https://docs.mongodb.com/manual/reference/limits/
It is possible that the limitation isn't enforced in the Node.js driver (I haven't inspected its source), in which case it's worth consulting the Azure docs: https://learn.microsoft.com/en-us/azure/cosmos-db/faq
The upshot is that you should really use a cursor to walk the collection when you are dealing with this many documents. See here: How can I use a cursor.forEach() in MongoDB using Node.js?
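As a rough sketch of the cursor approach (not tested against Cosmos DB): the helper below walks any async iterable and hands the documents to a callback in fixed-size batches, so the full result set never has to sit in memory at once. The name forEachBatch and its parameters are made up for illustration, and the commented usage assumes a Mongoose version whose query cursors support for await.

```javascript
// Sketch: drain a cursor document by document and process fixed-size batches.
// `cursor` can be any async iterable; `handleBatch` is a hypothetical callback.
async function forEachBatch(cursor, batchSize, handleBatch) {
  let batch = [];
  let total = 0;
  for await (const doc of cursor) {
    batch.push(doc);
    if (batch.length === batchSize) {
      await handleBatch(batch);
      total += batch.length;
      batch = []; // start a fresh array so the handled batch is not mutated
    }
  }
  if (batch.length > 0) {
    await handleBatch(batch); // flush the final, partially filled batch
    total += batch.length;
  }
  return total; // number of documents seen
}

// Assumed usage with the model from the question (needs a live connection):
//   const cursor = MongoEidModel.find({}).lean().batchSize(500).cursor();
//   const count = await forEachBatch(cursor, 500, async (docs) => {
//     console.log(docs.length);
//   });
```

If Cosmos DB still answers with 429 at 400 RU/s, a short pause inside the batch callback would throttle the reads without a recursive loop.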
Hope this helps :)

MongoDB queries are taking 2-3 seconds from Node.js app on Heroku

I am having major performance problems with MongoDB. Simple find() queries are sometimes taking 2,000-3,000 ms to complete in a database with fewer than 100 documents.
I am seeing this both with a MongoDB Atlas M10 instance and with a cluster that I set up on DigitalOcean VMs with 4 GB of RAM. When I restart my Node.js app on Heroku, the queries perform well (less than 100 ms) for 10-15 minutes, but then they slow down.
Am I connecting to MongoDB incorrectly or querying incorrectly from Node.js? Please see my application code below. Or is this a lack of hardware resources in a shared VM environment?
Any help will be greatly appreciated. I've done all the troubleshooting I know how to do with Explain query and the Mongo shell.
var Koa = require('koa'); //v2.4.1
var Router = require('koa-router'); //v7.3.0
var MongoClient = require('mongodb').MongoClient; //v3.1.3
var app = new Koa();
var router = new Router();
app.use(router.routes());
// Connect to MongoDB
async function connect() {
  try {
    var client = await MongoClient.connect(process.env.MONGODB_URI, {
      readConcern: { level: 'local' }
    });
    var db = client.db(process.env.MONGODB_DATABASE);
    return db;
  }
  catch (error) {
    console.log(error);
  }
}
// Add MongoDB to Koa's ctx object
connect().then(db => {
  app.context.db = db;
});
// Get company's collection in MongoDB
router.get('/documents/:collection', async (ctx) => {
  try {
    var query = { company_id: ctx.state.session.company_id };
    var res = await ctx.db.collection(ctx.params.collection).find(query).toArray();
    ctx.body = { ok: true, docs: res };
  }
  catch (error) {
    ctx.status = 500;
    ctx.body = { ok: false };
  }
});
app.listen(process.env.PORT || 3000);
UPDATE
I am using MongoDB Change Streams and standard Server Sent Events to provide real-time updates to the application UI. I turned these off and now MongoDB appears to be performing well again.
Are MongoDB Change Streams known to impact read/write performance?
Change Streams do indeed affect the performance of your server, as noted in this SO question.
As mentioned in the accepted answer there:
The default connection pool size in the Node.js client for MongoDB is 5. Since each change stream cursor opens a new connection, the connection pool needs to be at least as large as the number of cursors.
const mongoConnection = await MongoClient.connect(URL, {poolSize: 100});
(Thanks to MongoDB Inc. for investigating this issue.)
You need to increase your pool size to get back your normal performance.
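To make the sizing rule concrete, here is a minimal sketch that derives the pool size from the number of open change streams plus some spare connections for ordinary queries. poolOptionsFor and queryHeadroom are hypothetical names, and the headroom of 5 is only an assumption. The option is called poolSize in the 3.x driver used in the question; 4.x and later drivers call it maxPoolSize.

```javascript
// Sketch: size the pool so change streams cannot starve ordinary queries.
// `changeStreamCount` and `queryHeadroom` are hypothetical inputs.
function poolOptionsFor(changeStreamCount, queryHeadroom = 5) {
  if (!Number.isInteger(changeStreamCount) || changeStreamCount < 0) {
    throw new RangeError('changeStreamCount must be a non-negative integer');
  }
  // One connection per open change stream, plus spare ones for find() etc.
  return { poolSize: changeStreamCount + queryHeadroom };
}

console.log(poolOptionsFor(20)); // { poolSize: 25 }

// Assumed usage (3.x driver):
//   const client = await MongoClient.connect(URL, poolOptionsFor(20));
```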
I'd suggest adding more logging. Queries that slow down some time after a restart can point to a worse problem than you might think.
A modern database and web app running on a normal machine rarely hit performance problems when everything is set up correctly. There might be a memory leak or other unreleased resources, or network congestion.
IMHO, you should first determine whether it's a network problem. You can do that by enabling the slow query log on MongoDB and by logging in your code when each query begins and ends.
If the network is fine and MongoDB reports no slow queries, something is going wrong in your own application. Detailed logging can show where the query gets slow.
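One way to log where a query begins and ends is a small timing wrapper around the call. This is only a sketch: timed, slowMs, and the 100 ms default threshold are made-up names and values, not part of the MongoDB driver or Koa.

```javascript
// Sketch: time any promise-returning operation and log the slow ones.
// `label`, `slowMs`, and `log` are hypothetical parameters.
async function timed(label, operation, slowMs = 100, log = console.warn) {
  const start = process.hrtime.bigint();
  try {
    return await operation();
  } finally {
    // Runs on success and on failure, so failed queries are timed too.
    const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
    if (elapsedMs >= slowMs) {
      log(`[slow] ${label} took ${elapsedMs.toFixed(1)} ms`);
    }
  }
}

// Assumed usage inside the route handler from the question:
//   var res = await timed('find documents', () =>
//     ctx.db.collection(ctx.params.collection).find(query).toArray());
```

Comparing these application-side timings with MongoDB's own slow query log should show whether the time is spent in the database, on the network, or in the app.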
Hope this helps.

.find() returns empty when used with node.js and mongoose but returns data on mongo shell [duplicate]

I have tried using find and findOne and neither returns a document. find returns an empty array while findOne returns null. err is null in both cases as well.
Here is my connection:
function connectToDB(){
  mongoose.connect("mongodb://localhost/test"); //i have also tried 127.0.0.1
  db = mongoose.connection;
  db.on("error", console.error.bind(console, "connection error:"));
  db.once("open", function callback(){
    console.log("CONNECTED");
  });
};
Here is my schema:
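var fileSchema = mongoose.Schema({
  hash: String,
  type: String,
  extension: String,
  size: String,
  uploaded: {type:Date, default:(Date.now)},
  // A function default is evaluated per document; Date.now()+oneDay would be computed once at schema definition
  expires: {type:Date, default:function(){ return Date.now()+oneDay; }}
});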
var Model = mongoose.model("Model", fileSchema);
And my query is here:
Model.find({},function(err, file) {
  console.log(err)
  console.log(file);
});
I can upload things to the database and see them via RockMongo but I cannot fetch them afterwards. This is my first time using MongoDB so I think I'm just missing some of the fundamentals. Any push in the right direction would be great!
The call to mongoose.model establishes the name of the collection the model is tied to, with the default being the pluralized, lower-cased model name. So with your code, that would be 'models'. To use the model with the files collection, change that line to:
var Model = mongoose.model("Model", fileSchema, "files");
or
var Model = mongoose.model("file", fileSchema);
To avoid the pluralization complexity altogether, simply pass the exact collection name as the third argument:
var Model = mongoose.model("Model", fileSchema, "exact name of your db collection");
It's very confusing (at least for me).
I had kind of the same problem. The solutions above didn't work for me. My app never returns an error even if the query matches nothing; it returns an empty array. So I put this in my code:
if(queryResult.length==0) return res.status(404).send("not found");
This issue probably comes from creating a Mongoose model without specifying the name of the collection.
Try changing: const Model = mongoose.model("Model", fileSchema);
To this: const Model = mongoose.model("Model", fileSchema, "NameOfCollection");
const growingUnit= mongoose.model('Growing Unit', growingUnitSchema);
I had a space in 'Growing Unit' on purpose and it always returned empty array. Removing that space to become 'GrowingUnit' was the fix needed in my scenario.
const growingUnit= mongoose.model('GrowingUnit', growingUnitSchema);
General "hello world" issues (sometimes this problem is not related to Mongoose at all):
Check that the collection is not actually empty (for example, in the MongoDB Atlas UI).
Check for small spelling differences (like listing instead of listings) in the collection names your queries use.
Check that you use the correct URI for your connection (for example, you may be trying to retrieve data from a collection that exists on localhost while connecting to a MongoDB cloud cluster, or have some other connection string problem).
https://docs.mongodb.com/manual/reference/connection-string/
For me the issue was .skip(value): I was passing page=1 instead of page=0.
As I had only a few records, the skip went past all of them and I always got an empty array.
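The off-by-one is easy to reproduce without a database. The sketch below uses a plain array in place of a collection; skipFor, pageOf, and firstPage are hypothetical names used only for illustration.

```javascript
// Sketch: convert a page number into the offset passed to .skip().
// `firstPage` says whether the caller numbers pages from 0 or from 1.
function skipFor(page, pageSize, firstPage = 0) {
  const pagesToSkip = page - firstPage;
  if (!Number.isInteger(pagesToSkip) || pagesToSkip < 0) {
    throw new RangeError(`page must be an integer >= ${firstPage}`);
  }
  return pagesToSkip * pageSize;
}

// Stand-in for a small collection and for find().skip(n).limit(pageSize).
const records = ['a', 'b', 'c'];
const pageSize = 10;
const pageOf = (page, firstPage) => {
  const skip = skipFor(page, pageSize, firstPage);
  return records.slice(skip, skip + pageSize);
};

console.log(pageOf(0, 0)); // zero-based, first page: all three records
console.log(pageOf(1, 0)); // zero-based, page=1 skips 10 records: empty array
console.log(pageOf(1, 1)); // one-based, first page: all three records
```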

Mongoose _id assigned before saving

var mongoose = require('mongoose');
mongoose.connect('mongodb://localhost/test');
var Cat = mongoose.model('Cat', { name: String });
var kitty = new Cat({ name: 'Zildjian' });
console.log(kitty);
kitty.save();
console.log(kitty);
This outputs:
{ name: 'Zildjian', _id: 523194d562b0455801000001 } twice.
I've tried delaying the save with a timeout, but the result is the same, which suggests the _id is set by new Cat() and not by .save().
Is this because of MongoDB or Mongoose? Why is the _id set before the document is actually persisted?
Most MongoDB drivers automatically generate the ObjectId/_id client side, including the native driver for Node.js. Only a tiny amount of locking is needed to generate a unique ID, so there's little reason not to distribute the generation to connected clients.
Mongoose needs a unique identifier to track and reference objects, so it creates an identifier immediately.
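To illustrate why no server round trip is needed, here is a sketch of how a client can build a 12-byte ObjectId-style value on its own: a 4-byte timestamp, a 5-byte per-process random value, and a 3-byte counter. This is not the driver's actual implementation; generateObjectIdHex and objectIdTimestamp are hypothetical helpers. The last lines decode the timestamp embedded in the _id from the question.

```javascript
const crypto = require('crypto');

// Chosen once per process: 5 random bytes and a random counter start.
const processRandom = crypto.randomBytes(5);
let counter = crypto.randomBytes(3).readUIntBE(0, 3);

// Sketch: build an ObjectId-style 24-character hex string entirely client side.
function generateObjectIdHex(now = Date.now()) {
  const id = Buffer.alloc(12);
  id.writeUInt32BE(Math.floor(now / 1000), 0); // bytes 0-3: seconds since epoch
  processRandom.copy(id, 4);                   // bytes 4-8: per-process random value
  counter = (counter + 1) % 0x1000000;         // wrap at 3 bytes
  id.writeUIntBE(counter, 9, 3);               // bytes 9-11: incrementing counter
  return id.toString('hex');
}

// Read back the creation time stored in the first 4 bytes.
function objectIdTimestamp(hex) {
  return new Date(parseInt(hex.slice(0, 8), 16) * 1000);
}

console.log(generateObjectIdHex());
console.log(objectIdTimestamp('523194d562b0455801000001').toISOString());
// 2013-09-12T10:17:57.000Z
```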
In the Node.js driver you can optionally set the forceServerObjectId option to true to change this behavior.
However, this cannot be overridden when using Mongoose, per the docs:
Mongoose forces the db option forceServerObjectId false and cannot be
overridden. Mongoose defaults the server auto_reconnect options to
true which can be overridden. See the node-mongodb-native driver
instance for options that it understands.
