Mongoose fails insertion for insertMany when a specific index is defined - node.js

Mongoose fails to insert documents when I use the insertMany command. I have around 2000 documents to insert, and instead of saving them one by one I am trying to use insertMany to save them in a single batch.
If no specific index is defined, it takes a very long time just to save them to the database; if an index is defined, the connection times out as soon as the insertion starts.
Model.insertMany(documents, function (batchSaveError, savedDocs) {
  if (batchSaveError) {
    callback(batchSaveError);
  } else {
    callback(null);
  }
});
This is the code that I am trying to get working.

The issue seemed pretty vague at first. A connection timeout is normal and can happen in any scenario, and whenever the connection times out Mongoose tries to reconnect by itself.
What I was missing was a handler for the connection's error event, and that was causing the whole application to crash. Once I added a proper handler for the error event, the connection still timed out a few times, but the insertion went through smoothly.
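
The crash described above follows from how Node's EventEmitter works: emitting 'error' with no listener attached throws, and a Mongoose connection is an EventEmitter. Here is a minimal sketch of that behaviour, using a plain EventEmitter as a stand-in for mongoose.connection (the emitTimeout helper and the simulated timeout error are assumptions for illustration only):

```javascript
const { EventEmitter } = require('events');

// Stand-in for mongoose.connection, which is also an EventEmitter.
// emitTimeout is a hypothetical helper that simulates a connection error.
function emitTimeout(connection) {
  try {
    connection.emit('error', new Error('connection timed out'));
    return 'handled';
  } catch (err) {
    // With no 'error' listener attached, emit() throws the error itself.
    return 'thrown: ' + err.message;
  }
}

// No listener: the error surfaces as an exception (a crash if nothing catches it).
const unguarded = new EventEmitter();
const withoutListener = emitTimeout(unguarded);

// With a listener: the error is delivered to the handler and nothing is thrown.
const guarded = new EventEmitter();
const seen = [];
guarded.on('error', function (err) {
  // In a real app this is where you would log the error.
  seen.push(err.message);
});
const withListener = emitTimeout(guarded);

console.log(withoutListener); // thrown: connection timed out
console.log(withListener);    // handled
```

With Mongoose, the equivalent would be attaching the listener with mongoose.connection.on('error', handler) before starting the insert.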

Related

SailsJs is deleting data from pg database

Something strange is happening with my app. I am using SailsJs with the official PostgreSQL driver and my data gets deleted. I don't have a pattern or a list of specific events that delete the data, but I have the following observations.
A few days ago I was writing a function to destroy data, and when I executed that function it gave me an error. I fixed the error and ran my web app again, and all the data from one of my tables was gone.
Yesterday I wrote a function and tried to make an HTTP call to it, but it was giving me a 500 server error. I started debugging it, and after running my program 3 to 4 times with this error, part of the data was deleted from one of my database tables. It later turned out that I had a typo in the URL.
If any of you have experienced this, please let me know how to fix it, or at least help me reproduce the issue.
EDIT
I activated the logs and waited for it to happen again. It did, and here is the log from SailsJs
In the logs I saw that it mentions the alter.js sync strategy, but I have selected the safe strategy.
This has happened to me quite a few times: the app is lifting and in the process of making changes to the db, and that process fails, sometimes due to an ORM timeout.
What Sails does when it is lifting and needs to update the data structure is controlled by the migrate setting in config/models.js (migrate: 'alter'). The setting is usually commented out, in which case you get a prompt asking what to do (1... 2... 3...; I'm writing from memory and don't remember the actual messages) and a warning about using alter on a production system.
Changing config/orm.js to have this
// config/orm.js
module.exports.orm = {
  _hookTimeout: 60000 // I used 60 seconds as my new timeout
};
and, for reasons I don't know, changing config/pubsub.js
// config/pubsub.js
module.exports.pubsub = {
  _hookTimeout: 60000 // I used 60 seconds as my new timeout
};
has helped me avoid data loss.

node-vertica throws exception for multi-queries with a callback

I have created a simple web interface for Vertica that exposes simple operations on a Vertica cluster.
One of the features I expose is querying Vertica.
When a user enters a multi-query, the node module throws an exception and my process exits with exit code 1.
Is there any way to catch this exception?
Is there any way overcome the problem in a different way?
Right now there's no way to overcome this when using a callback for the query result.
Preventing this from happening would involve making sure there's only one query in the user's input. This is hard because it involves parsing SQL.
The callback API isn't built to deal with multi-queries. I simply haven't bothered implementing proper handling of this case, because this has never been an issue for me.
Instead of a callback, you could use the event listener API, which will send you lower level messages, and handle this yourself.
var q = conn.query("SELECT...; SELECT...");
q.on("fields", function(fields) { ... }); // once per query
q.on("row", function(row) { ... }); // zero or more times per query
q.on("end", function(status) { ... }); // once per query
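
One way to use those events is to group them into one result set per statement. The following is a minimal sketch under assumptions, not part of node-vertica: collectResultSets and onResultSet are hypothetical names, and a plain EventEmitter stands in for the object returned by conn.query so the example runs without a cluster.

```javascript
const { EventEmitter } = require('events');

// Groups the events of a multi-query into one result set per statement.
// `query` is anything that emits "fields", "row" and "end" as described above.
function collectResultSets(query, onResultSet) {
  let current = null;

  // Lazily create the result set so a statement without "fields" still works.
  function ensureCurrent() {
    if (current === null) {
      current = { fields: null, rows: [], status: null };
    }
    return current;
  }

  query.on('fields', function (fields) {
    ensureCurrent().fields = fields;
  });
  query.on('row', function (row) {
    ensureCurrent().rows.push(row);
  });
  query.on('end', function (status) {
    const finished = ensureCurrent();
    finished.status = status;
    current = null; // the next statement starts a fresh result set
    onResultSet(finished);
  });
}

// Simulated query object standing in for conn.query("SELECT...; SELECT...").
const fakeQuery = new EventEmitter();
const resultSets = [];
collectResultSets(fakeQuery, function (set) {
  resultSets.push(set);
});

fakeQuery.emit('fields', ['a']);
fakeQuery.emit('row', [1]);
fakeQuery.emit('row', [2]);
fakeQuery.emit('end', 'SELECT');
fakeQuery.emit('fields', ['b']);
fakeQuery.emit('end', 'SELECT');

console.log(resultSets.length); // 2
```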

where to specify "noCursorTimeout" option using nodejs-mongodb driver?

It might be obvious, but right now I can't find it in the docs or by googling.
I'm using mongodb with the nodejs-driver and have a potentially long operation (> 10 minutes) pertaining to a cursor which does get a timeout (as specified in http://docs.mongodb.org/manual/core/cursors/#cursor-behaviors).
In the nodejs-driver API Documentation (http://mongodb.github.io/node-mongodb-native/2.0/api/Cursor.html) a method addCursorFlag(flag, value) is mentioned to be called on a Cursor.
However, there's no example on how to do it, and simply calling e.g.
objectCollection.find().limit(objectCount).addCursorFlag('noCursorTimeout', true).toArray(function (err, objects) {
  ...
});
leads to a TypeError: Object #<Cursor> has no method 'addCursorFlag'.
So how to go about making this Cursor exist longer than those 10 minutes?
Moreover, as required by the mongodb documentation, how do I then manually close the cursor?
Thanks!
The example you've provided:
db.collection.find().addCursorFlag('noCursorTimeout',true)
..works fine for me on driver version 2.14.21. I've an open cursor for 45 minutes now.
Could it be that you were using the 1.x NodeJS driver?
So I've got a partial solution for my problem. It doesn't say so in the API docs, but apparently you have to specify it in the find() options like so:
objectCollection.find({},{timeout: false}).limit(objectCount).toArray(function (err, objects) {
  ...
});
However, what about the cleanup? Do those cursors ever get killed? Is a call to db.close() sufficient?
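
On the cleanup question, one option is to iterate the cursor yourself and close it explicitly once you are done or an error occurs. This is a minimal sketch, assuming a 2.x driver where cursor.next(callback) yields null when the cursor is exhausted and cursor.close(callback) releases it. drainCursor is a hypothetical helper, and the fake cursor below only stands in for a real one so the example runs without a database.

```javascript
// Visits every document, then closes the cursor whether or not an error occurred.
function drainCursor(cursor, onDocument, done) {
  function finish(err) {
    // Always close, on normal completion and on error alike.
    cursor.close(function (closeErr) {
      done(err || closeErr || null);
    });
  }

  function step() {
    cursor.next(function (err, doc) {
      if (err) return finish(err);
      if (doc === null) return finish(null); // cursor exhausted
      try {
        onDocument(doc);
      } catch (handlerErr) {
        return finish(handlerErr);
      }
      step();
    });
  }

  step();
}

// Fake cursor used only to exercise the helper without a database.
function makeFakeCursor(docs) {
  let index = 0;
  return {
    closed: false,
    next: function (callback) {
      callback(null, index < docs.length ? docs[index++] : null);
    },
    close: function (callback) {
      this.closed = true;
      callback(null);
    }
  };
}

const fakeCursor = makeFakeCursor([{ _id: 1 }, { _id: 2 }]);
const visited = [];
let finalError;
drainCursor(fakeCursor, function (doc) {
  visited.push(doc._id);
}, function (err) {
  finalError = err;
});
```

With a real cursor you would pass something like objectCollection.find().addCursorFlag('noCursorTimeout', true) as the first argument.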

Cancel previous MongoDB operation from the same client

I have a MongoDB collection of 3257477 cities, and I'm using Mongoose on NodeJS to access it. I'm making requests to it repeatedly (once per 500ms). Requests are usually answered very quickly. However, when I make a bad typo the query takes a long time and requests start to pile up until the initial request is answered. Here are some logs I collected of requests and responses:
21:48:50 started query for "new"
21:48:50 finished query for "new"
21:48:52 started query for "newj ljl" // blockage
21:48:54 started query for "newj"
21:48:55 started query for "new"
21:48:57 started query for "new ye"
21:48:59 started query for "new york"
21:49:08 finished query for "newj ljl" // blockage removed, quick queries flood in
21:49:08 finished query for "new"
21:49:08 finished query for "new york"
21:49:08 finished query for "new ye"
21:49:23 finished query for "newj"
I'm able to cancel the requests made by the client so I'm not worried about queries coming back in the wrong order. And I'm not interested in how to make that query faster at this point, since queries for actual correct spellings are quick.
I'm wondering how a new request can cancel an old request that was made by the same client. In other words "newj ljl" gets canceled when "newj" arrives, "newj" gets canceled when "new" arrives, and so on. If it's just going to be thrown out, why tie up the database?
Is there a proper way to do this?
Update:
I'm aware of db.currentOp().inprog and I'm thinking I can use the client property of the documents within that array to know whether it's a repeat request, but I can't quite figure out how to access that from Mongoose. I'm also not sure when to do that, or how I know which request was spawned from this client (and therefore which to cancel). I'd like an actual code example using Mongoose, or the native NodeJS MongoDB driver if possible!
Here's some sample code to go off of:
models.City.find({ ... })
.exec(function (err, cities) {
});
Below is what I came up with to solve the issue.
I can easily do db.currentOp().inprog and db.killOp() from the Mongo shell, but I really need this to happen automatically, when it needs to, from Mongoose. Since you can reference the MongoDB driver using require('mongoose').connection.db, you can execute those commands by doing "queries" on the following collections:
db.collection('$cmd.sys.inprog');
db.collection('$cmd.sys.killop');
The full solution:
var db = require('mongoose').connection.db,
    // get the client IP address
    ip = request.headers['x-forwarded-for'] ||
         request.connection.remoteAddress ||
         request.socket.remoteAddress ||
         request.connection.socket.remoteAddress;
// same thing as db.currentOp().inprog
db.collection('$cmd.sys.inprog').findOne(function (err, data) {
  if (err) throw err;
  data.inprog.filter(function (op) {
    // get the operation's client IP address without the port
    return ip == op.client.split(':')[0];
  }).forEach(function (op) {
    // same thing as db.killOp()
    db.collection('$cmd.sys.killop')
      .findOne({ 'op': op.opid }, function (err, data) {
        if (err) throw err;
      });
  });
  // start the new cities query
  models.City.find({ ... })
    .exec(function (err, cities) {
    });
});
Helpful links:
https://groups.google.com/forum/#!topic/mongodb-user/1wFp7AqWnM4
drop database with mongoose
How to determine a user's IP address in node
You can try using db.killOp()
http://docs.mongodb.org/manual/reference/method/db.killOp/#db.killOp
UPDATE: You can get the list of current operations from db.currentOp() and identify the operation to be cancelled by matching fields like op, query and client
http://docs.mongodb.org/manual/reference/method/db.currentOp/#db.currentOp
You can definitely do this with killop, and the above solution looks like it could work for the problem as stated. However, I think it may be worthwhile to dig a bit deeper.
The fact that you have a noticeably slow query when you've got a query that's going to return no results seems unusual. That reeks of a full collection scan. The questions to ask are, first, do you have indices set up, and second, are you querying with a general regex? MongoDB doesn't really handle regex searches like { "name" : /.*new york.*/ } particularly well.
Also, the whole "send an http request every time the user hits a key" approach is simple and elegant, but also causes some unnecessary server load. Perhaps a search button or a client-side timeout where you only send a request if a user hasn't hit a key for 1 second could help alleviate the need for the killop approach.
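
The client-side timeout suggested above is usually called debouncing. A minimal sketch, assuming a hypothetical sendQuery function that stands in for the HTTP request; the 20 ms wait keeps the demo short, where the suggestion above would use 1000:

```javascript
// Returns a wrapper that calls fn only after waitMs have passed without another call.
function debounce(fn, waitMs) {
  let timer = null;
  return function () {
    const args = arguments;
    const self = this;
    clearTimeout(timer); // cancel the pending call, if any
    timer = setTimeout(function () {
      timer = null;
      fn.apply(self, args);
    }, waitMs);
  };
}

// sendQuery is a hypothetical stand-in for the request sent to the server.
const sent = [];
function sendQuery(text) {
  sent.push(text);
}

const debouncedSend = debounce(sendQuery, 20);
debouncedSend('new');
debouncedSend('newj');
debouncedSend('new york'); // only this last call results in a request
```

Because the earlier keystrokes never reach the server, there is nothing left for killop to cancel in the common case.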

Mongoose and commander

I'm writing some scripts for some command-line manipulation of Mongoose models with commander.js (eventually, I'd like to run these tools using Cron).
Now, I've written several scripts with commander and they all work fine, but if I connect to the MongoDB database using mongoose, the script just hangs after it's done. I figured the database connection was keeping node alive, so I added a mongoose.disconnect() line, and it still hangs.
The only thing I found that allows me to shutdown is to use process.exit(), but I'm reluctant to just terminate the process. Is there something in particular that I should do to trigger a graceful shutdown?
My reading of the API docs implies that .disconnect() must be given a callback function. It looks like it's called for each connection that's disconnected and may be passed an error.
There is a check in the code to make sure it's not called if it doesn't exist when things work out, but that check isn't being run on errors, so if Mongoose received an error message from the MongoDB client, it may be leaving a connection open and that's why it's not stopping execution.
If you're only opening a single connection to the database, you may just want to call [Connection object].close() since that function correctly inserts a no-op "callback" if no callback is given, and looks like it will correctly destruct things.
(The more I look into Mongoose, the more I want to just write a thin wrapper around the MongoDB client so I don't have to deal with Mongoose's "help.")
I use the async "Series" to perform operations and then call mongoose.connection.close() on completion. It prevents callback hell and allows you to neatly perform operations either one at a time or parallel followed by a function when all the other methods have completed. I use it all the time for scripts that require mongoose but are meant to terminate after all mongoose operations are finished.
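
The pattern described above (run every operation in order, then close the connection) can be sketched with plain callbacks in place of the async library. This is an illustration under assumptions, not the answerer's code: runThenClose is a hypothetical helper, and `connection` is anything with a close(callback) method, such as mongoose.connection in Mongoose versions that accept a callback there.

```javascript
// Runs callback-style tasks one after another, then closes the connection.
function runThenClose(tasks, connection, done) {
  let index = 0;

  function finish(err) {
    // Close even if a task failed, so the process can exit on its own.
    connection.close(function (closeErr) {
      done(err || closeErr || null);
    });
  }

  function next(err) {
    if (err || index === tasks.length) return finish(err || null);
    const task = tasks[index++];
    task(next);
  }

  next(null);
}

// Fake connection and tasks used only to exercise the helper.
const fakeConnection = {
  closed: false,
  close: function (callback) {
    this.closed = true;
    callback(null);
  }
};
const order = [];
let seriesError;
runThenClose([
  function (cb) { order.push('first'); cb(null); },
  function (cb) { order.push('second'); cb(null); }
], fakeConnection, function (err) {
  seriesError = err;
});
```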
Shutting down the node program directly is hiding the symptoms, not fixing the problem!
I finally isolated the problem and found it to be with Mongoose schema definitions. If you try to shut down the connection too soon after Mongoose schemas are defined [1], the application hangs and eventually produces some weird MongoDB-related error.
Adding a small timeout before running the program.parse(argv) line to run the commander application fixes the problem. Just wrap the code like so:
var program = require('commander')
  , mongoose = require('mongoose')
  , models = null
  ;
// Define command line syntax.
program
  .command(...)
  ;
mongoose.connect(
  ..., // connection parameters.
  function() {
    // connected to database, defined schemas.
    models = require('./models');
    // Wait 1 second before running the application code.
    setTimeout(function(){
      program.parse(process.argv);
    }, 1000);
  }
);
[1]: This is my initial interpretation; I have not (yet) extensively tested this theory. However, removing Mongoose schema definitions from the application successfully prevents the application from hanging.
Actually, just using process.nextTick() instead of the setTimeout() call fixes the situation nicely!
