The efficiency of continuously polling MongoDB in Node - node.js

I need to continuously update data on the client based on DB changes. I'm thinking about having a function run on a 5-second interval that gathers all the DB information and uses Socket.IO to emit the data to the client.
Currently, I'm doing this on the client itself without Socket.IO, just repeatedly making a REST call to the server, which then handles the data.
My question is: is either of these methods efficient, and is there a better way to achieve what I'm trying to do?

Ryan, you can try using MongoDB's collection.watch() which fires an event every time an update is made to a collection. You would need to do that within the socket connection event for it to work though. Something along these lines:
io.sockets.on('connection', function(socket) {
    // when the socket is connected, start listening to MongoDB
    const MongoClient = require("mongodb").MongoClient;
    MongoClient.connect("mongodb://192.168.1.201")
        .then(client => {
            console.log("Connected correctly to server");
            // specify db and collections
            const db = client.db("your_db");
            const collection = db.collection("your_collection");
            const changeStream = collection.watch();
            // start listening to changes
            changeStream.on("change", function(change) {
                console.log(change);
                // this is where you can fire the socket.emit('the_change', change)
            });
        })
        .catch(err => {
            console.error(err);
        });
});
Note that using this approach will require you to set up a replica set. You can follow those instructions or use a Dockerised replica set such as this one.

I'd need more details to be sure, but polling doesn't sound like a good solution.
If the data you need does not change rapidly (say, not every few seconds), each of your connections is still polling every 5 seconds, which is wasteful.
In that case you could trigger an event wherever the data gets changed, then push the message through the sockets that are active.
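A minimal sketch of that event-driven idea, assuming a MongoDB collection handle and a Socket.IO server instance are passed in: one change stream is opened for the whole process, and each change is broadcast to every connected socket with io.emit. The helper name broadcastChanges is hypothetical, and the event name 'the_change' is borrowed from the comment in the answer above.

```javascript
// Sketch: one change stream per process, fanned out to all connected sockets.
// `collection` is assumed to be a MongoDB driver Collection and `io` a
// Socket.IO server; `broadcastChanges` is a hypothetical helper name.
function broadcastChanges(collection, io, eventName = 'the_change') {
  const changeStream = collection.watch();
  changeStream.on('change', (change) => {
    // io.emit sends to every currently connected client
    io.emit(eventName, change);
  });
  changeStream.on('error', (err) => {
    console.error('change stream error:', err);
  });
  return changeStream;
}
```

Calling this once at startup, rather than inside each connection handler, keeps the number of change streams independent of the number of clients. Change streams still require a replica set, as noted above.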

Related

Socket IO makes multiple connections when the page is refreshed - Node JS

I have developed a scraping tool that scrapes jobs from all websites and saves them into the database. I have built my own log view where I get messages (errors, info, etc.). I am using socket.io to update my view in real time, and for the database too.
The problem is that when I start the app, it makes the socket and database connections perfectly. But when I refresh the page, the same connection is made again, twice, with the same message and different IDs. The more I refresh the page, the more connections are made; the ID changes, but all of the connections that were made use one ID.
Below is the log which shows it:
I have uploaded this video, please check it as well. Watch the very beginning, and then 01:41 and 03:06: before scraping of the first site starts, the connection is established, but when scraping of the second website starts, the Internet Connection message is shown twice, and the same goes for the third website. The number of messages doubles every time. I don't know why.
I have tried following the answer to this question, but still no success. The server file is 600+ lines, the second file is 150+ lines, and the client side is about the same, so I can't upload it all, and it's a bit confidential.
But the socket connection on the client and server is like this:
Server Side
const express = require("express");
const app = express();
const scrap = require("./algorithm");
const event = scrap.defEvent;//imported from another file
const ms_connect = scrap.ms_connect;
const server = app.listen(8000, function(){ console.log('Listening on 8000'); });
const io = require("socket.io").listen(server);
const internetAvailable = require("internet-available");
app.use(express.static(__dirname + "/"));
app.get("/scrap",function(req,res){
res.sendFile(__dirname+"/index.html");//Set the Default Route
io.on("connection",function(socket){ //On Socket Connection
socketSameMess("Socket",'Sockets Connection Made on ID : <span style="color:#03a9f4;">'+socket.id+'<span>');
ms_connect.connect(function(err){//On Connection with Database
if(err) socketSameMess("database_error",err+" "); // If any error in database connection
socketSameMess("Database ",'Connected to MYSQL Database Successfully...');
})
})
})
function eventSameMess(auth,mess){
//hits the custom console
defEvent.emit("hitConsole",{a:auth,m:mess});
}
Client Side
var socket = io.connect('http://localhost:8000');
socket.on('connect',function(){
if(socket.connected){
initDocument();
}
})
Getting multiple messages
Here are some rules of thumb for Socket.IO:
If you listen to an event once, the callback receives each message once.
If you listen to an event twice, the callback receives each message twice.
If you listen to an event n times, the callback receives each message n times.
If you start listening to an event on page load, don't forget to stop listening to it before you leave the page (unless the listener is meant to be global).
If you forget to stop listening and then revisit the page, you'll be listening to the event multiple times, because page load registers the listener again while the previous one is still active.
Don't register a listener inside a loop; it may be registered multiple times and you'll get multiple messages for a single change.
connect socket
const socket = io('http://localhost', {
transports: ['websocket'],
upgrade: false
});
listen and listen off an event
let onMessage = (data) => {
console.log(data);
}
//listen message
socket.on('message', onMessage);
//stop listening message
socket.off('message', onMessage);
remove all listeners on disconnect
socket.on('disconnect', () => {
socket.removeAllListeners();
});
Use Firecamp to test/debug SocketIO events visually.
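The accumulation described in the rules of thumb above can be reproduced with Node's built-in EventEmitter standing in for the Socket.IO server. This is only a sketch: leakyIo, fixedIo, the handler functions, and the counters are hypothetical names, and the first half mirrors the question's pattern of calling io.on("connection", ...) inside the route handler.

```javascript
const { EventEmitter } = require('events');

// Pattern from the question: the listener is registered inside the request
// handler, so every page load adds one more listener.
const leakyIo = new EventEmitter();
let leakyMessages = 0;
function leakyHandleRequest() {
  leakyIo.on('connection', () => { leakyMessages += 1; });
}

// Alternative: register the listener once, at startup.
const fixedIo = new EventEmitter();
let fixedMessages = 0;
fixedIo.on('connection', () => { fixedMessages += 1; });
function fixedHandleRequest() {
  // only serves the page; no listener registration here
}

// Simulate three page loads, then a single new socket connection.
for (let i = 0; i < 3; i += 1) {
  leakyHandleRequest();
  fixedHandleRequest();
}
leakyIo.emit('connection');
fixedIo.emit('connection');

console.log(leakyMessages); // 3: one message per accumulated listener
console.log(fixedMessages); // 1
```

In the question's server code, this would correspond to moving the io.on("connection", ...) call out of the app.get("/scrap", ...) callback so that it runs once when the server starts.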
I was having the problem that each client was getting two socket connections. I thought something was wrong with the sockets,
but the actual cause was this:
Front end -> React
I created the project using create-react-app.
In the index.js file it uses something called React.StrictMode.
In development, this mode renders some of the App.js components twice.
Just remove React.StrictMode and see if your problem is solved.

Efficient Socket.io distribution with Mongoose stream

I'm trying to create an efficient streaming node.js app, where the server would connect to a stream (capped collection) in MongoDB with mongoose, and then emit the stream directly to the client browsers.
What I'm worried about is the scalability of my design. Let me know if I'm wrong, but it seems that right now, for every new web browser that is opened, a new connection to MongoDB will also be opened (it won't re-use the previously utilized connection), and therefore there will be a lot of inefficiencies if I have a lot of user connected at the same time. How can I improve that?
I'm thinking of a one server - multiple client type of design in socket.io but I don't know how to achieve that.
Code below:
server side (app.js):
io.on('connection', function (socket) {
console.log("connected!");
var stream = Json.find().lean().tailable({ "awaitdata": true, numberOfRetries: Number.MAX_VALUE}).stream();
stream.on('data', function(doc){
socket.emit('rmc', doc);
}).on('error', function (error){
console.log(error);
}).on('close', function () {
console.log('closed');
});
});
client side (index.html):
socket.on('rmc', function(json) {
doSomething(); // it just displays the data on the screen
});
Unfortunately this will not depend only on Mongo performance. Unless you have a high level of concurrency (1000+ streams), you shouldn't worry about Mongo for the moment.
With that kind of app you have bigger problems, for example: data type and compression, buffer overflows, bandwidth limits, Socket.IO limits, and OS limits. These are the problems you will most likely face first.
Now to answer your question: as far as I know, no, you are not opening a connection to Mongo per user. The users are connected to the app, not the database; the app is connected to the database.
Lastly, these links will help you understand and tweak your queries for this kind of job (streaming):
https://github.com/Automattic/mongoose/issues/1248
https://codeandcodes.com/tag/mongoose-vs-mongodb-native/
http://drewww.github.io/socket.io-benchmarking/
hope it helps !
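For the one-server, many-clients design the question asks about, one option is to open the tailable stream once and broadcast each document, instead of opening a stream inside every connection handler. A minimal sketch, assuming openStream is a function that returns the stream (for example the Json.find()...stream() call from the question) and io is the Socket.IO server; createSharedStream is a hypothetical helper, not a Mongoose or Socket.IO API.

```javascript
// Sketch: lazily open one stream for the whole process and broadcast from it.
function createSharedStream(openStream, io, eventName) {
  let stream = null;

  return function ensureStarted() {
    if (stream) return stream; // already running: reuse it

    stream = openStream();
    stream.on('data', (doc) => io.emit(eventName, doc)); // fan out to all clients
    stream.on('error', (err) => console.error(err));
    stream.on('close', () => { stream = null; }); // allow a later reopen
    return stream;
  };
}
```

Calling the returned function from inside io.on('connection', ...) would start the stream with the first client and let later clients share it. Clients that join late only receive documents emitted after they connect.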

Connection to Mongodb-Native-Driver in express.js

I am using mongodb-native-driver in an Express.js app. I have around 6 collections in the database, so I have created 6 js files, each representing a collection as a JavaScript object (e.g. function collection(){}) with prototype functions handling all the manipulation of that collection. I thought this would be a good architecture.
But the problem I am having is how to connect to the database. Should I create a connection in each of these files and use them? I think that would be overkill, as connect in mongodb-native-driver creates a pool of connections and having several of them would not be justified.
So how do I create a single connection pool and use it in all the collections.js files? I want the connection to work the way it's implemented in Mongoose. Let me know if any of my thinking about the app's architecture is wrong.
Using Mongoose would solve these problems, but I have read in several places that it's slower than the native driver, and I would also prefer schema-less models.
Edit: I created a module out of the models. Each collection is in its own file, which takes the database as an argument. In the index.js file I opened the database connection and kept a variable db once I got the database from the connection. (I used the auto-reconnect feature to make sure that the connection wasn't lost.) In the same index.js file I exported each of the collections like this:
exports.model1 = require('./model1')(db)
exports.model2 = require('./model2')(db)
This ensured that the database part was handled in just one module, and the app would just call the functions that each model.js file exported, like save(), findById(), etc. (whatever you do in the functions is up to you to implement).
how to connect to the database?
In order to connect using the MongoDB native driver you need to do something like the following:
var util = require('util');
var mongodb = require('mongodb');
var client = mongodb.MongoClient;
var auth = {
user: 'username',
pass: 'password',
host: 'hostname',
port: 1337,
name: 'databaseName'
};
var uri = util.format('mongodb://%s:%s@%s:%d/%s',
auth.user, auth.pass, auth.host, auth.port, auth.name);
/** Connect to the Mongo database at the URI using the client */
client.connect(uri, { auto_reconnect: true }, function (err, database) {
if (err) throw err;
else if (!database) console.log('Unknown error connecting to database');
else {
console.log('Connected to MongoDB database server at:');
console.log('\n\t%s\n', uri);
// Create or access collections, etc here using the database object
}
});
A basic connection is set up like this. This is all I can give you going on just the basic description of what you want. Post some of the code you've got so far to get more specific help.
Should I create a connection in each of this files and use them?
No.
So how do I create a single connection pool and use it in all the collections.js files?
You can create a single file with code like the above, let's call it dbmanager.js, which connects to the database. Define functions like createUser, deleteUser, etc. that operate on your database, then export them like so:
module.exports = {
createUser: function () { ; },
deleteUser: function () { ; }
};
which you could then require from another file like so:
var dbman = require('./dbmanager');
dbman.createUser(userData); // using connection established in `dbmanager.js`
EDIT: Because we're dealing with JavaScript and a single thread, the native driver indeed automatically handles connection pooling for you. You can look for this in the StackOverflow links below for more confirmation of this. The OP does state this in the question as well. This means that client.connect should be called only once by an instance of your server. After the database object is successfully retrieved from a call to client.connect, that database object should be reused throughout the entire instance of your app. This is easily accomplished by using the module pattern that Node.JS provides.
My suggestion is to create a module or set of modules which serves as a single point of contact for interacting with the database. In my apps I usually have a single module which depends on the native driver, calling require('mongodb'). All other modules in my app will not directly access the database, but instead all manipulations must be coordinated by this database module.
This encapsulates all of the code dealing with the native driver into a single module or set of modules. The OP seems to think there is a problem with the simple code example I've posted, describing a problem with a "single large closure" in my example. This is all pretty basic stuff, so I'm adding clarification as to the basic architecture at work here, but I still do not feel the need to change any code.
The OP also seems to think that multiple connections could possibly be made here. This is not possible with this setup. If you created a module like I suggest above, then the first time require('./dbmanager') is called it will execute the code in the file dbmanager.js and return the module.exports object. The exports object is cached and is also returned on each subsequent call to require('./dbmanager'); however, the code in dbmanager.js will only be executed on the first require.
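The caching behaviour described above can be checked with a small self-contained script. This sketch writes a throwaway module to a temporary directory and requires it twice; the file contents and the __dbmanagerLoads counter are made up for the demonstration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write a stand-in for dbmanager.js that counts how often its body runs.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'dbmanager-demo-'));
const file = path.join(dir, 'dbmanager.js');
fs.writeFileSync(file, [
  'global.__dbmanagerLoads = (global.__dbmanagerLoads || 0) + 1;',
  '// a real module would call client.connect(...) here, once',
  'module.exports = { createUser: function () {}, deleteUser: function () {} };',
].join('\n'));

const first = require(file);
const second = require(file);

console.log(first === second);        // true: same cached exports object
console.log(global.__dbmanagerLoads); // 1: module body executed only once
```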
If you don't want to create a module like this then the other option would be to export only the database passed to the callback for client.connect and use it directly in different places throughout your app. I recommend against this however, regardless of the OPs concerns.
Similar, possibly duplicate Stackoverflow questions, among others:
How to manage mongodb connections in nodejs webapp
Node.JS and MongoDB, reusing the DB object
Node.JS - What is the right way to deal with MongoDB connections
As the accepted answer says, you should create only one connection for all incoming requests and reuse it, but the answer is missing a solution that creates and caches the connection. I wrote an Express middleware to achieve this: express-mongo-db. At first sight this task is trivial, and most people use this kind of code:
var db;
function createConnection(req, res, next) {
    if (db) { req.db = db; return next(); }
    // db is still undefined while the first connect is pending, so
    // concurrent requests each reach this call (the leak described below)
    client.connect(uri, { auto_reconnect: true }, function (err, database) {
        if (err) return next(err);
        req.db = db = database;
        next();
    });
}
app.use(createConnection);
But this code leads to a connection leak when multiple requests arrive at the same time and db is still undefined. express-mongo-db solves this by holding incoming clients and calling connect only once, when the module is required (not when the first request arrives).
Hope you find it useful.
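One common way to avoid the race in the snippet above is to cache the pending connection promise rather than the resolved database object, so that concurrent first requests all wait on the same connect call. The sketch below is not the express-mongo-db implementation; cachedConnection is a hypothetical helper, and connect is assumed to be any function that returns a promise for the database object.

```javascript
// Sketch: Express-style middleware that connects at most once.
function cachedConnection(connect) {
  let dbPromise = null;

  return function attachDb(req, res, next) {
    if (!dbPromise) {
      dbPromise = connect(); // first request starts the one and only connect
      dbPromise.catch(() => { dbPromise = null; }); // allow retry after a failure
    }
    dbPromise.then(
      (db) => { req.db = db; next(); },
      (err) => next(err)
    );
  };
}
```

It would be mounted with app.use(cachedConnection(connectToDatabase)), where connectToDatabase is your own wrapper around the driver's connect call.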
I just thought I would add my own method of connecting to MongoDB, for others who are interested or having problems with different methods.
This method assumes you don't need authentication (I use this on localhost).
Authentication is still easy to implement.
var MongoClient = require('mongodb').MongoClient;
var Server = require('mongodb').Server;
var client = new MongoClient(new Server('localhost',27017,{
socketOptions: {connectTimeoutMS: 500},
poolSize:5,
auto_reconnect:true
}, {
numberOfRetries:3,
retryMilliseconds: 500
}));
client.open(function(err, client) {
if(err) {
console.log("Connection Failed Via Client Object.");
} else {
var db = client.db("theDbName");
if(db) {
console.log("Connected Via Client Object . . .");
db.logout(function(err,result) {
if(!err) {
console.log("Logged out successfully");
}
client.close();
console.log("Connection closed");
});
}
}
});
Credit goes to Brad Davley, who goes over this method in his book (pages 231-232).

Postgresql connection timed out in node.js and pg

I am new to Node, PostgreSQL, and the whole web development business. I am currently writing a simple app which connects to a Postgres database and displays the content of a table in a web view. The app will be hosted on OpenShift.
My main entry is in server.js:
var pg = require('pg');
pg.connect(connection_string, function(err, client) {
// handle error
// save client: app.client = client;
});
Now, to handle the GET / request:
function handle_request(req, res){
app.client.query('...', function(err, result){
if (err) throw err; // Will handle error later, crash for now
res.render( ... ); // Render the web view with the result
});
}
My app seems to work: the table is rendered in the web view correctly, and it works for multiple connections (different web clients from different devices). However, if there is no request for a couple of minutes, a subsequent request crashes the app with timeout information. Here is the stack trace:
/home/hai/myapp/server.js:98
if (err) throw err;
^
Error: This socket is closed.
at Socket._write (net.js:474:19)
at Socket.write (net.js:466:15)
at [object Object].query (/home/hai/myapp/node_modules/pg/lib/connection.js:109:15)
at [object Object].submit (/home/hai/myapp/node_modules/pg/lib/query.js:99:16)
at [object Object]._pulseQueryQueue (/home/hai/myapp/node_modules/pg/lib/client.js:166:24)
at [object Object].query (/home/hai/myapp/node_modules/pg/lib/client.js:193:8)
at /home/hai/myapp/server.js:97:17
at callbacks (/home/hai/myapp/node_modules/express/lib/router/index.js:160:37)
at param (/home/hai/myapp/node_modules/express/lib/router/index.js:134:11)
at pass (/home/hai/myapp/node_modules/express/lib/router/index.js:141:5)
Is there a way to keep the connection from timing out (better)? Or to reconnect on demand (best)? I have tried redesigning my app to connect to the database not at startup but on each GET / request. This solution works only for the first request, then crashes on the second. Any insight is appreciated.
Have you looked into the postgres keepalive setting values? It sends packets to keep idle connections from timing out.
http://www.postgresql.org/docs/9.1/static/runtime-config-connection.html
I also found this similar question:
How to use tcp_keepalives settings in Postgresql?
You could also run very small queries against the db at a set interval. However, this method is definitely more of a hack.
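A minimal sketch of that interval approach, assuming client is a connected pg client whose query(text, callback) method is used to send a trivial statement. startKeepAlive, the interval value, and the onError callback are illustrative names, not part of pg.

```javascript
// Sketch: periodically send a trivial query so the connection is not idle.
// `client` is assumed to be a connected pg Client.
function startKeepAlive(client, intervalMs, onError) {
  const ping = () => {
    client.query('SELECT 1', (err) => {
      if (err && onError) onError(err); // e.g. log it or trigger a reconnect
    });
  };
  const timer = setInterval(ping, intervalMs);
  timer.unref(); // do not keep the process alive just for the keepalive
  return { ping, stop: () => clearInterval(timer) };
}
```

The interval should be shorter than whatever idle timeout is closing the connection, which depends on the hosting environment.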
Edit: You could also try initiating the client like this:
var client = new pg.Client(conString);
Before you make your queries, you can check if the client is still connected. I believe you can use:
if(client.connection._events != null)
client.connect();
I faced the same problem. Telling the client to close the connection on the query's end event
query.on('end', function() {
    client.end();
});
did the trick for me.
You can also change the default idle timeout of 30 seconds to whatever value you need. E.g.
pg.defaults.poolIdleTimeout = 600000; // 10 mins
I'm setting the keepAlive parameter to true and it works.
This is my configuration, and it solved the problem:
const client_pg = new Client({
connectionString,
keepAlive: true,
keepAliveInitialDelayMillis: 10000
});

How to handle Mongoose DB connection interruptions

I've been evaluating Mongoose (an ORM for node.js which uses MongoDB for persistent storage).
What I'd like to do is make sure that the app can run when the DB is not up at startup, and that it also handles the DB going down intelligently.
Currently my test app, which does not work in either case, does this:
var mongoose_connection = mongoose.createConnection(DATABASE_URL, {server:{poolSize:4}});
Then I use that connection when making Models.
In the case where the DB is down when the app starts, any save() calls on instances silently fail with no error. If the DB comes back up, they never get written.
So I would need to detect that the connection never happened and have the app be able to tell that at runtime so that I can handle it somehow.
When the DB goes down after the app has started, though, the save() calls still do not cause errors, but they are queued and then written when the DB comes back.
That seems fine, except I'd like to hook into the API to get events when the DB is down and to query how many saves are queued. At some point I might have so many queued events that I would want to just stop making new ones and have the app back off.
Case #1: the db is down on app startup. There is a minor bug preventing this use case that I'm fixing now. However, here is the workaround:
var db = mongoose.createConnection();
db.on('error', function (err) {
if (err) // couldn't connect
// hack the driver to allow re-opening after initial network error
db.db.close();
// retry if desired
connect();
});
function connect () {
db.open('localhost', 'dbname');
}
connect();
https://gist.github.com/2878607
An ugly but working gist. First shutdown mongo, then run this gist.
Notice the connection failures.
Then start mongo and see all of the queued up inserts complete and dumped to the console.
Writes and finds will begin.
Shutdown mongo, notice that the inserts and finds are being attempted but no callbacks are executed.
Restart mongo. Notice all of the queued inserts and finds are completed.
If you would rather fail all requests to the server when the db is down, the native driver does emit a reconnect event, which can be detected in a middleware.
This works and emits the reconnect event fine (mongodb native driver 1.3.23):
mongoose.connection.db.on('reconnect', function (ref) {
connected=true;
console.log('reconnect to mongo server.');
});
So my dbconnection middleware looks for connected/error/reconnect
(some of the events are redundant, but that does no harm!).
PS: the initial connection failure still needs to be handled by a retry, as in
aaronheckmann's answer above.
mongoose.connection.on('open', function (ref) {
connected=true;
console.log('open connection to mongo server.');
});
mongoose.connection.on('connected', function (ref) {
connected=true;
console.log('connected to mongo server.');
});
mongoose.connection.on('disconnected', function (ref) {
connected=false;
console.log('disconnected from mongo server.');
});
mongoose.connection.on('close', function (ref) {
connected=false;
console.log('close connection to mongo server');
});
mongoose.connection.on('error', function (err) {
connected=false;
console.log('error connection to mongo server!');
console.log(err);
});
mongoose.connection.db.on('reconnect', function (ref) {
connected=true;
console.log('reconnect to mongo server.');
});
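The middleware itself is not shown above. A minimal sketch of what it could look like, assuming an Express-style (req, res, next) signature and a shared state object updated by the event handlers; connectionState and requireDb are hypothetical names, and the 503 response is just one possible choice.

```javascript
// Shared flag; the mongoose.connection event handlers above would set
// connectionState.connected instead of a bare `connected` variable.
const connectionState = { connected: false };

// Express-style middleware: reject requests while the db is down.
function requireDb(req, res, next) {
  if (!connectionState.connected) {
    res.statusCode = 503; // Service Unavailable
    res.end('Database unavailable');
    return;
  }
  next();
}
```

Mounted with app.use(requireDb) ahead of the routes that need the database, this fails fast instead of letting operations queue while the connection is down.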

Resources