NodeJS writes to MongoDB only once

I have a NodeJS app that generates a lot of data sets synchronously (multiple nested for-loops). Those data sets are supposed to be saved to my MongoDB database so I can look them up more efficiently later on.
I use the mongodb driver for NodeJS and have a daemon running. The connection to the DB works fine, and according to the daemon window the first group of data sets is stored successfully. Every ~400-600ms there is another group to store, but after the first data set there is no output in the MongoDB console anymore (not even an error), and since the file sizes don't increase I assume those write operations don't work (I can't wait for the script to finish, as a full run would take multiple days).
If I restart the NodeJS script it won't even save the first key anymore, possibly because of duplicates? If I delete the db folder's contents, the first one is saved again.
This is the essential part of my script, and I wasn't able to find anything I did wrong. I assume the problem is more in the inner logic (weird duplicate checks, not running concurrently, etc.).
var MongoClient = require('mongodb').MongoClient,
    dbBuffer = [];

MongoClient.connect('mongodb://127.0.0.1/loremipsum', function(err, db) {
    if(err) return console.log("Cant connect to MongoDB");
    var collection = db.collection('ipsum');
    console.log("Connected to DB");
    for(var q = startI; q < endI; q++) {
        for(var w = 0; w < words.length; w++) {
            dbBuffer.push({a: a, b: b});
        }
        if(dbBuffer.length) {
            console.log("saving " + dbBuffer.length + " items");
            collection.insert(dbBuffer, {w: 1}, function(err, result) {
                if(err) {
                    console.log("Error on db write", err);
                    db.close();
                    process.exit();
                }
            });
        }
        dbBuffer = [];
    }
    db.close();
});
Update
db.close is never called and the connection doesn't drop.
Changing to a bulk insert doesn't change anything.
The callback for the insert is never called - this could be the problem! The MongoDB console does tell me that the insert was successful, but it looks like the communication between the driver and MongoDB isn't working properly for insertion.

I "solved" it myself. One misconception I had was that every insert is confirmed in the MongoDB console, while it actually only confirms the first one, or inserts with some time between them. To check whether the insert process really works, one needs to run the script for some time and wait for MongoDB to flush the data to the local files (approx. 30-60s).
In addition, the inserts followed each other too quickly, and MongoDB appears not to handle this correctly under Win10 x64. I changed from the array buffer to the driver's internal bulk buffer (see comments) and only continued with the process after the previous data was inserted.
This is the simplified resulting code
db.collection('seedlist', function(err, collection) {
    syncLoop(0, 0, collection);
    //...
});
function syncLoop(q, w, collection) {
    var batch = collection.initializeUnorderedBulkOp({useLegacyOps: true});
    for(var e = 0; e < words.length; e++) {
        batch.insert({a: a, b: b});
    }
    batch.execute(function(err, result) {
        if(err) throw err;
        //...
        return setTimeout(function() {
            syncLoop(qNew, wNew, collection);
        }, 0); // Timer to unwind the stack and prevent memory growth
    });
}
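The wait-for-the-previous-insert idea above can be sketched without a database at all. In the sketch below, `fakeInsert` is a stand-in for `batch.execute` (an assumption for illustration, not driver API); only the control flow matches the answer: the next batch is scheduled from inside the previous insert's callback, so inserts never overlap in flight.

```javascript
// Simulates an async bulk insert (stand-in for batch.execute).
function fakeInsert(batch, callback) {
  setImmediate(function () {
    callback(null, { inserted: batch.length });
  });
}

var inserted = [];

// Process one batch, and only schedule the next one from the
// insert callback, so batches are written strictly in sequence.
function syncLoop(q, batches, done) {
  if (q >= batches.length) return done(inserted);
  fakeInsert(batches[q], function (err, result) {
    if (err) throw err;
    inserted.push(result.inserted);
    // setTimeout(..., 0) unwinds the call stack between batches
    setTimeout(function () {
      syncLoop(q + 1, batches, done);
    }, 0);
  });
}

syncLoop(0, [[1, 2], [3], [4, 5, 6]], function (counts) {
  console.log(counts); // logs [ 2, 1, 3 ]
});
```

Because the recursion happens inside a timer callback, the stack never grows with the number of batches, which is what the `setTimeout` in the original answer is for.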

Related

Inserting document into Mongodb with NodeJs returns document but it is not in collection

I have a weird issue: I have a test to prove that you cannot create a new user if a user with the same name/email exists, but it always seems to fail. However, when I look at the first step, which adds a user to the database with the expected details, it inserts and calls back with the document:
[ { Username: 'AccountUser',
Email: 'some#email.com',
CreatedDate: Mon Sep 22 2014 12:52:48 GMT+0100 (GMT Summer Time),
_id: 54200d90d0a34ffc1565df13 } ]
So if I do a console.log of the documents returned from insert, that's what I get, which is OK, and the _id has been set by Mongo, which is correct, so I know the call succeeded and there were no errors.
However, if I then go and view the database (MongoVUE), that collection is empty, so I am a bit baffled as to why this is happening.
I am using pooled MongoDB connections (i.e. setting up the database during app setup, then using app.set("database_connection", database)).
Here is some example code, which is part of larger code but basically is contained within promises:
// Point of connection creation
var mongoClient = mongodb.MongoClient;
var connectionCallback = function (err, database) {
    if (err) { reject(err); }
    resolve(database);
};
try {
    mongoClient.connect(connectionString, connectionCallback);
} catch(exception) {
    reject(exception);
}

// Point of usage
var usersCollection = connection.collection("users");
usersCollection.insert(userDocument, function(err, docs) {
    if(err) {
        reject(err);
        return;
    }
    console.log(docs);
    resolve(docs[0]);
});
So I create the connection and pass back the database for use elsewhere; then at the point of inserting I get the connection, get the collection, insert into it, and log the documents added, which is what the log output above shows.
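A minimal sketch of the promise wrapper this snippet implies, with `fakeConnect` standing in for `mongodb.MongoClient.connect` (the stub and its names are assumptions for the sketch, not the real driver):

```javascript
// Stand-in for mongodb.MongoClient.connect (assumption, not the real driver).
function fakeConnect(connectionString, callback) {
  setImmediate(function () {
    if (!connectionString) return callback(new Error('no connection string'));
    callback(null, { name: 'database', connectionString: connectionString });
  });
}

// Wrap the callback-style connect in a Promise: resolve with the
// database handle, reject on error, so app setup can do e.g.
// connectDb(url).then(db => app.set("database_connection", db)).
function connectDb(connectionString) {
  return new Promise(function (resolve, reject) {
    try {
      fakeConnect(connectionString, function (err, database) {
        if (err) return reject(err);
        resolve(database);
      });
    } catch (exception) {
      reject(exception);
    }
  });
}

connectDb('mongodb://127.0.0.1:27017/testdb').then(function (db) {
  console.log(db.name); // logs: database
});
```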
Also finally running:
Windows 8.1
Node 0.11.13
npm mongodb latest
MongoDB 2.6.1
Not the answer I was hoping for, but I restarted the computer (a rare occurrence) and the issue no longer occurs. I have no idea what was causing it, and I am still slightly worried in case it happens again, but for now I am up and running again.

How to insert CSV data into MongoDB with NodeJS

Hi, I'm developing an app with NodeJS, Express and MongoDB, and I need to take user data from a CSV file and upload it to my database; the db has a schema designed with Mongoose.
But I don't know how to do this. What is the best approach to read the CSV file, check for duplicates against the db, and, if the user (one column in the CSV) is not there, insert it?
Are there modules to do this, or do I need to build it from scratch? I'm pretty new to NodeJS, so I need a few pieces of advice here.
Thanks
This app has an Angular frontend so the user can upload the file. Maybe I should read the CSV on the frontend and transform it into an array for Node, then insert it?
Use one of the several node.js csv libraries like this one, and then you can probably just run an upsert on the user name.
An upsert is an update query with the upsert flag set to true: {upsert: true}. This will insert a new record only if the search returns zero results. So your query may look something like this:
db.collection.update({username: userName}, newDocumentObj, {upsert: true})
Where userName is the current username you're working with and newDocumentObj is the JSON document that may need to be inserted.
However, if the query does return a result, it performs an update on those records.
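As an illustration only, here is an in-memory model of those upsert semantics in plain JavaScript (not the driver; the `username` matching is assumed for the sketch):

```javascript
// In-memory model of {upsert: true}: update matching documents,
// or insert the new document when nothing matches.
function upsert(collection, query, newDoc) {
  var matched = false;
  for (var i = 0; i < collection.length; i++) {
    if (collection[i].username === query.username) {
      collection[i] = newDoc; // query matched: update the existing record
      matched = true;
    }
  }
  if (!matched) collection.push(newDoc); // zero results: insert
  return collection;
}

var users = [{ username: 'alice', age: 30 }];
upsert(users, { username: 'alice' }, { username: 'alice', age: 31 });
upsert(users, { username: 'bob' }, { username: 'bob', age: 25 });
console.log(users.length); // logs 2 - alice updated, bob inserted
console.log(users[0].age); // logs 31
```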
EDIT:
I've decided that an upsert is not appropriate for this but I'm going to leave the description.
You're probably going to need to do two queries here: a find and a conditional insert. For the find query I'd use the toArray() function (instead of a stream), since you are expecting 0 or 1 results. Check whether you got a result for the username and, if not, insert the data.
Read about node's mongodb library here.
EDIT in response to your comment:
It looks like you're reading data from a local CSV file, so you should be able to structure your program like this:
function connect(callback) {
    var connStr = 'mongodb://' + host + ':' + port + '/' + schema; // command line args; hard-code if not needed
    MongoClient.connect(connStr, function(err, db) {
        if(err) {
            callback(err, null);
        } else {
            var colObj = db.collection(collection); // command line arg; hard-code if not needed
            callback(null, colObj);
        }
    });
}
connect(function(err, colObj) {
    if(err) {
        console.log('Error:', err.stack);
        process.exit(0);
    } else {
        console.log('Connected');
        doWork(colObj, function(err) {
            if(err) {
                console.log(err.stack);
                process.exit(0);
            }
        });
    }
});
function doWork(colObj, callback) {
    csv().from('/path/to/file.csv').on('data', function(data) {
        // look the username up first, insert only if it is absent
        colObj.find({username: data.username}).toArray(function(err, results) {
            if(err) return callback(err);
            if(results.length === 0) colObj.insert(data, callback);
        });
    });
}

Node JS + Mongoose: saving documents inside while()

I am just playing around with Node and Mongoose and I am curious about the following issue:
I am trying to save documents to mongo from within a loop / interval.
The following works just fine:
setInterval(function() {
    var doc = new myModel({ name: 'test' });
    doc.save(function (err, doc) {
        if (err) return console.error(err);
        doc.speak();
    });
}, 1);
The following does not work:
while(true) {
    var doc = new myModel({ name: 'test' });
    doc.save(function (err, doc) {
        if (err) return console.error(err);
        doc.speak();
    });
}
What is the explanation for this behavior? The save callback never fires in scenario 2.
Additionally, can someone comment on best practices for building "long-running workers"? I am interested in using Node to build background workers to process queues of data. Is a while() a bad idea? setInterval()? I also plan to use the forever module to keep the process alive.
Thanks!
Node.js is single-threaded, so while(true) fully occupies that single thread, never giving the doc.save callback a chance to run.
The second part of your question is too broad though, and you should really only ask one question at a time anyway.
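The blocking described in this answer is easy to demonstrate with a bounded busy-wait in place of while(true):

```javascript
// A timer scheduled for 0ms cannot fire while the thread is busy:
// its callback only runs after the synchronous loop below finishes.
var start = Date.now();
var fired = false;

setTimeout(function () {
  fired = true;
  console.log('timer fired after', Date.now() - start, 'ms');
}, 0);

// Busy-wait ~100ms; a while(true) would block forever in the same way.
while (Date.now() - start < 100) {}

console.log('fired during the loop?', fired); // logs false
```

The same mechanics apply to doc.save: its callback is queued on the event loop, and an infinite synchronous loop means that queue is never drained.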

Manually shutdown mongod.exe won't fire an error using node-mongodb-native

this is my first post on here.
I am learning Node and Mongodb. I have installed the node-mongodb-native driver and found some unexpected things.
My script is below, based on the official tutorial.
var test; // shortcut for later use
require('mongodb').MongoClient.connect("mongodb://localhost:27017/exampleDb", function(err, db) {
    if (err) { return console.log("some error", err); }
    console.log("database connected");
    test = db.collection('test');
    test.find().toArray(function(err, items) {
        if (err) { console.log("some error"); }
        else { console.log("still connected"); }
    });
});

var rep = function () {
    setTimeout(function () {
        test.find().toArray(function(err, items) {
            if (err) { console.log("some error: " + err); }
            else { console.log("still connected"); }
        });
        rep();
    }, 2000);
};
rep();
So every 2 seconds the console.log outputs "still connected"; this is expected.
If the mongod.exe window is shut down (to simulate a loss of connection to the database), I expect to see "some error" in the console.log. However, no error message is logged.
When mongod.exe is restarted (to simulate a reconnection), the console.log outputs many streams of "still connected".
I have not been able to find the answer from the manual or other online sources. So 2 questions from me:
1) What is the current best practice to detect a sudden MongoDB disconnection, apart from on('close')? And should an error be emitted from the find() query when such a disconnection happens?
2) Upon reconnection, the console.log outputs many lines of the same message, as if they had been queued during the disconnection and are now being logged at once. Is this normal? I can think of a real-life issue: instead of find(), an admin may be using insert() to add some data. The database connection is disrupted, but there is no error message, so the admin may keep doing many insert()s and still see no result. Then, upon reconnection, a swarm of insert() queries inundates the database?
Please point at my ignorance...

How to handle Node/MongoDB connection management?

I'm using the node-mongodb-native to connect to a local MongoDB instance. I'm having a little trouble wrapping my head around how to handle the connections. I've attempted to abstract the MongoDB stuff into a custom Database module:
Database.js
var mongo = require('mongodb');

var Database = function() { return this; };

Database.prototype.doStuff = function doStuff(callback) {
    mongo.connect('mongodb://127.0.0.1:27017/testdb', function(err, conn) {
        conn.collection('test', function(err, coll) {
            coll.find({}, function(err, cursor) {
                cursor.toArray(function(err, items) {
                    conn.close();
                    return callback(err, items);
                });
            });
        });
    });
};

// Testing
new Database().doStuff(function(err, items) {
    console.log(err, items);
});
Is a new connection required for each method? That seems like it would get expensive awfully quick. I imagined that perhaps the connection would be established in the constructor and subsequent calls would leverage the existing connection.
This next question may be more of a design question, but considering how connection setup and tear-down may be expensive operations, I'm considering adding a Database object that is global to my application that can be leveraged to make calls to the database. Does this seem reasonable?
Please note that the code above was roughly taken from here. Thanks for your help.
You don't need a new connection for each method - you can open it once and use it for subsequent calls. The same applies to the individual collection variables: cache the result of a single call to collection() and reuse it, so you only go through those callbacks once instead of repeating them everywhere.
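A minimal sketch of that open-once pattern, with `fakeConnect` standing in for the real driver's connect (the stub and its names are assumptions for the sketch):

```javascript
// Stand-in async connect (assumption, not the real driver).
// connectCalls counts how many real connections would be opened.
var connectCalls = 0;
function fakeConnect(callback) {
  connectCalls++;
  setImmediate(function () {
    callback(null, { collection: function (name) { return { name: name }; } });
  });
}

var cachedDb = null;

// Open the connection once and hand the same handle to every caller;
// later calls reuse the cached db instead of reconnecting.
function getDb(callback) {
  if (cachedDb) return setImmediate(callback, null, cachedDb);
  fakeConnect(function (err, db) {
    if (err) return callback(err);
    cachedDb = db;
    callback(null, db);
  });
}

getDb(function (err, db1) {
  getDb(function (err, db2) {
    console.log(db1 === db2); // logs true - same handle both times
  });
});
```

One caveat of this simple sketch: two getDb calls made before the first connect resolves would each open a connection, so a real version would cache the in-flight request (e.g. a promise) rather than the finished handle.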
