I'm a bit stuck here and was hoping to get some help.
My Node application has a separate module where I connect to Postgres and export the pool like so:
const { Pool, Client } = require('pg');
const pool = new Pool({
  user: process.env.POSTGRES_USER,
  host: process.env.POSTGRES_URL,
  database: process.env.POSTGRES_DATABASE,
  password: process.env.POSTGRES_PASSWORD,
  port: process.env.POSTGRES_PORT,
  keepAlive: 0,
  ssl: {
    rejectUnauthorized: false,
    sslmode: require
  },
  connectionTimeoutMillis: 10000, // 10 seconds
  allowExitOnIdle: true,
  max: 10
});
pool.connect()
  .then(() => console.log('postgress connected'))
  .catch(err => console.error(err))
module.exports = pool
On my route I have a Redis cache as middleware. This works as expected: I can confirm responses are being served by Redis, and the logic in the route does not run when the request is cached. However, while load testing to see how everything would handle spikes, I noticed I started to get errors from Postgres:
Error: timeout exceeded when trying to connect
I also got errors about max connections etc.
I have tried increasing the pool's max connections, but I still get this error when running larger load tests.
My question is: why would PG be trying to connect if the connection should be shared? Additionally, why is it even trying to connect if the request is cached?
Any help would be appreciated!
Apparently some of your stress test cases are missing the redis cache. You haven't shown any code relevant to that, so what more can be said?
The error you show is not generated by PostgreSQL, it is generated by node's 'pg' module. You configured it to only allow 10 simultaneous connections. If more than that are requested, they have to line up and wait. And you also configured it to wait only for 10 seconds before bombing out with an error, and that is exactly what you are seeing.
You vaguely allude to other errors, but you would have to share the actual error message with us if you want help.
The system seems to be operating as designed. You did a stress test to see what would happen, and you have seen what happens.
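The queueing behaviour described in this answer can be sketched with a toy pool. This is not the pg module's implementation; TinyPool and demo are made-up names, and the limits are shrunk (2 slots, 50 ms) so the effect shows up quickly. It only illustrates why a request that cannot get a slot within connectionTimeoutMillis fails with this message:

```javascript
// Toy pool: NOT the pg implementation, only the same queueing idea.
class TinyPool {
  constructor({ max, connectionTimeoutMillis }) {
    this.max = max;
    this.timeout = connectionTimeoutMillis;
    this.inUse = 0;
    this.waiters = [];
  }

  // Resolves with a release() function once a slot is free,
  // or rejects if no slot frees up within the timeout.
  connect() {
    if (this.inUse < this.max) {
      this.inUse += 1;
      return Promise.resolve(() => this._release());
    }
    return new Promise((resolve, reject) => {
      const waiter = { resolve };
      waiter.timer = setTimeout(() => {
        this.waiters.splice(this.waiters.indexOf(waiter), 1);
        reject(new Error('timeout exceeded when trying to connect'));
      }, this.timeout);
      this.waiters.push(waiter);
    });
  }

  _release() {
    const next = this.waiters.shift();
    if (next) {
      clearTimeout(next.timer);
      next.resolve(() => this._release()); // hand the slot to the next waiter
    } else {
      this.inUse -= 1;
    }
  }
}

async function demo() {
  const pool = new TinyPool({ max: 2, connectionTimeoutMillis: 50 });
  const releaseA = await pool.connect();
  await pool.connect();           // second slot, never released (a "leak")
  const third = pool.connect();   // has to queue
  const fourth = pool.connect();  // queues behind the third
  setTimeout(releaseA, 10);       // frees one slot before the timeout
  const outcome = await Promise.allSettled([third, fourth]);
  return outcome.map((result) => result.status);
}

// The third caller gets the freed slot, the fourth times out.
demo().then((statuses) => console.log(statuses));
```

With the real pg pool the same reasoning applies to any client that is checked out with pool.connect() and never handed back with client.release(): it occupies one of the max slots for as long as the process runs.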
I'm running an Express.js application using Socket.io for a chat webapp
and I get the following error randomly around 5 times during 24h.
The node process is wrapped in forever and it restarts itself immediately.
The problem is that restarting Express kicks my users out of their rooms
and nobody wants that.
The web server is proxied by HAProxy. There are no socket stability issues,
just using websockets and flashsockets transports.
I cannot reproduce this on purpose.
This is the error with Node v0.10.11:
events.js:72
throw er; // Unhandled 'error' event
^
Error: read ECONNRESET //alternatively it's a 'write'
at errnoException (net.js:900:11)
at TCP.onread (net.js:555:19)
error: Forever detected script exited with code: 8
error: Forever restarting script for 2 time
EDIT (2013-07-22)
Added both socket.io client error handler and the uncaught exception handler.
Seems that this one catches the error:
process.on('uncaughtException', function (err) {
console.error(err.stack);
console.log("Node NOT Exiting...");
});
So I suspect it's not a Socket.io issue but an HTTP request to another server
that I do or a MySQL/Redis connection. The problem is that the error stack
doesn't help me identify my code issue. Here is the log output:
Error: read ECONNRESET
at errnoException (net.js:900:11)
at TCP.onread (net.js:555:19)
How do I know what causes this? How do I get more out of the error?
Ok, not very verbose but here's the stacktrace with Longjohn:
Exception caught: Error ECONNRESET
{ [Error: read ECONNRESET]
code: 'ECONNRESET',
errno: 'ECONNRESET',
syscall: 'read',
__cached_trace__:
[ { receiver: [Object],
fun: [Function: errnoException],
pos: 22930 },
{ receiver: [Object], fun: [Function: onread], pos: 14545 },
{},
{ receiver: [Object],
fun: [Function: fireErrorCallbacks],
pos: 11672 },
{ receiver: [Object], fun: [Function], pos: 12329 },
{ receiver: [Object], fun: [Function: onread], pos: 14536 } ],
__previous__:
{ [Error]
id: 1061835,
location: 'fireErrorCallbacks (net.js:439)',
__location__: 'process.nextTick',
__previous__: null,
__trace_count__: 1,
__cached_trace__: [ [Object], [Object], [Object] ] } }
Here I serve the flash socket policy file:
net = require("net")
net.createServer( (socket) =>
  socket.write("<?xml version=\"1.0\"?>\n")
  socket.write("<!DOCTYPE cross-domain-policy SYSTEM \"http://www.macromedia.com/xml/dtds/cross-domain-policy.dtd\">\n")
  socket.write("<cross-domain-policy>\n")
  socket.write("<allow-access-from domain=\"*\" to-ports=\"*\"/>\n")
  socket.write("</cross-domain-policy>\n")
  socket.end()
).listen(843)
Can this be the cause?
You might have guessed it already: it's a connection error.
"ECONNRESET" means the other side of the TCP conversation abruptly closed its end of the connection. This is most probably due to one or more application protocol errors. You could look at the API server logs to see if it complains about something.
But since you are also looking for a way to inspect the error and potentially debug the problem, you should take a look at "How to debug a socket hang up error in NodeJS?", which was posted on Stack Overflow in relation to a similar question.
Quick and dirty solution for development:
Use longjohn; it gives you long stack traces that include the async operations.
Clean and correct solution:
Technically, in node, whenever you emit an 'error' event and no one listens to it, it will throw. To make it not throw, put a listener on it and handle it yourself. That way you can log the error with more information.
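A minimal sketch of that rule, using a plain EventEmitter in place of a real socket. The error object is constructed by hand here purely for illustration; on a real connection Node creates it:

```javascript
const { EventEmitter } = require('events');

// No 'error' listener: emit() throws the error itself.
const bare = new EventEmitter();
let thrown = null;
try {
  bare.emit('error', new Error('read ECONNRESET'));
} catch (err) {
  thrown = err;
}
console.log('without a listener, emit threw:', thrown.message);

// With an 'error' listener: the handler receives the error and nothing is thrown.
const guarded = new EventEmitter();
const seen = [];
guarded.on('error', (err) => seen.push(err.message));
guarded.emit('error', new Error('read ECONNRESET'));
console.log('with a listener, the handler received:', seen);
```

Sockets, HTTP requests and database connections are all EventEmitters, so the same applies to each of them: whichever one lacks an 'error' listener is the one that can take the process down.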
To have one listener for a group of calls you can use domains, which also catch other errors at runtime. Make sure each async operation related to http (Server/Client) runs in a different domain context from the other parts of the code. The domain will automatically listen for the error events and propagate them to its own handler, so you only listen to that handler to get the error data. You also get more information for free.
EDIT (2013-07-22)
As I wrote above:
"ECONNRESET" means the other side of the TCP conversation abruptly closed its end of the connection. This is most probably due to one or more application protocol errors. You could look at the API server logs to see if it complains about something.
What could also be the case: at random times, the other side is overloaded and simply kills the connection as a result. Whether that's the case depends on what exactly you're connecting to…
But one thing's for sure: you indeed have a read error on your TCP connection which causes the exception. You can see that by looking at the error code you posted in your edit, which confirms it.
A simple tcp server I had for serving the flash policy file was causing this. I can now catch the error using a handler:
# serving the flash policy file
net = require("net")
net.createServer((socket) =>
  # just added
  socket.on("error", (err) =>
    console.log("Caught flash policy server socket error: ")
    console.log(err.stack)
  )
  socket.write("<?xml version=\"1.0\"?>\n")
  socket.write("<!DOCTYPE cross-domain-policy SYSTEM \"http://www.macromedia.com/xml/dtds/cross-domain-policy.dtd\">\n")
  socket.write("<cross-domain-policy>\n")
  socket.write("<allow-access-from domain=\"*\" to-ports=\"*\"/>\n")
  socket.write("</cross-domain-policy>\n")
  socket.end()
).listen(843)
I had a similar problem where apps started erroring out after an upgrade of Node. I believe this can be traced back to this item in the Node v0.9.10 release notes:
net: don't suppress ECONNRESET (Ben Noordhuis)
Previous versions wouldn't error out on interruptions from the client. Now a break in the connection from the client raises ECONNRESET in Node. I believe this is intended functionality for Node, so the fix (at least for me) was to handle the error, which I believe you did in your uncaughtException handler, although I handle it in the net.Socket error handler.
You can demonstrate this:
Make a simple socket server and get Node v0.9.9 and v0.9.10.
require('net')
  .createServer( function(socket)
  {
    // do nothing
  })
  .listen(21, function()
  {
    console.log('Socket ON')
  })
Start it up using v0.9.9 and then attempt to FTP to this server. I'm using FTP and port 21 only because I'm on Windows and have an FTP client, but no telnet client handy.
Then from the client side, just break the connection. (I'm just doing Ctrl-C)
You should see NO ERROR when using Node v0.9.9, and an ERROR when using Node v0.9.10 and up.
In production I use some v0.10 release and it still gives the error. Again, I think this is intended and the solution is to handle the error in your code.
Had the same problem today.
After some research I found the very useful --abort-on-uncaught-exception Node.js option. Not only does it provide a much more verbose and useful error stack trace, it also saves a core file when the application crashes, allowing further debugging.
I also got the ECONNRESET error during development. What fixed it for me was not using nodemon to start my server and just running "node server.js" instead.
It's weird, but it worked for me; I have not seen the ECONNRESET error since.
I was facing the same issue but I mitigated it by placing:
server.timeout = 0;
before server.listen. server is an HTTP server here. The default timeout is 2 minutes as per the API documentation.
Yes, your serving of the policy file can definitely cause the crash.
To reproduce it, just add a delay to your code:
net.createServer( function(socket)
{
for (i=0; i<1000000000; i++) ;
socket.write("<?xml version=\"1.0\"?>\n");
…
… and use telnet to connect to the port. If you disconnect telnet before the delay has expired, you'll get a crash (uncaught exception) when socket.write throws an error.
To avoid the crash here, just add an error handler before reading/writing the socket:
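net.createServer(function(socket)
{
  for (var i = 0; i < 1000000000; i++);  // artificial delay, for reproduction only
  // attach the handler before any read/write on the socket
  socket.on('error', function(error) { console.error("error", error); });
  socket.write("<?xml version=\"1.0\"?>\n");
}).listen(843);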
When you try the above disconnect, you'll just get a log message instead of a crash.
And when you're done, remember to remove the delay.
Another possible case (but rare) could be if you have server to server communications and have set server.maxConnections to a very low value.
In node's core lib net.js it will call clientHandle.close() which will also cause error ECONNRESET:
if (self.maxConnections && self._connections >= self.maxConnections) {
  clientHandle.close(); // causes ECONNRESET on the other end
  return;
}
ECONNRESET occurs when the server side closes the TCP connection and your request to the server is not fulfilled. The server responds with a message indicating that the connection you are referring to is no longer valid.
Why would the server report an invalid connection?
Suppose you have enabled a keep-alive connection between client and server, with the keep-alive timeout configured to 15 seconds. This means that if the connection sits idle for 15 seconds, the server will send a connection-close request. So after 15 seconds, the server tells the client to close the connection. BUT, while the server is sending this request, the client may be sending a new request that is already in flight to the server. Since the connection is no longer valid, the server rejects it and the client sees an ECONNRESET error. So the problem shows up when requests to the server are infrequent enough for the connection to go idle. Disabling keep-alive avoids this race.
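If the client in question is Node's own http module, one way to disable keep-alive is a dedicated Agent. The sketch below is an illustration under that assumption, not part of the answer above; the throwaway local server and the names noKeepAliveAgent and demo exist only so the example is self-contained and can show which Connection header the client sends:

```javascript
const http = require('http');

// Agent that opens a fresh TCP connection per request instead of reusing idle ones.
const noKeepAliveAgent = new http.Agent({ keepAlive: false });

// Starts a local server that echoes back the Connection header it received,
// makes one request through the agent, and resolves with the echoed value.
function demo() {
  return new Promise((resolve, reject) => {
    const server = http.createServer((req, res) => {
      res.end(String(req.headers.connection));
    });
    server.listen(0, '127.0.0.1', () => {
      const { port } = server.address();
      http.get({ host: '127.0.0.1', port, agent: noKeepAliveAgent }, (res) => {
        let body = '';
        res.setEncoding('utf8');
        res.on('data', (chunk) => { body += chunk; });
        res.on('end', () => { server.close(); resolve(body); });
      }).on('error', (err) => { server.close(); reject(err); });
    });
  });
}

demo().then((header) => console.log('Connection header sent:', header));
```

The trade-off is a new TCP handshake for every request, so this suits low-traffic clients better than busy ones.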
I had this Error too and was able to solve it after days of debugging and analysis:
my solution
For me VirtualBox (for Docker) was the problem. I had port forwarding configured on my VM and the error only occurred on the forwarded port.
general conclusions
The following observations may save you days of work I had to invest:
For me the problem only occurred on connections from localhost to localhost on one port. -> Check whether changing any of these constants solves the problem.
For me the problem only occurred on my machine. -> Let someone else try it.
For me the problem only occurred after a while and couldn't be reproduced reliably.
My problem couldn't be inspected with any of Node's or Express's (debug) tools. -> Don't waste time on this.
-> Figure out whether something is messing with your network (settings), like VMs, firewalls etc.; this is probably the cause of the problem.
I solved the problem by simply connecting to a different network. That is one of the possible problems.
As discussed above, ECONNRESET means that the other side of the TCP conversation abruptly closed its end of the connection.
Your internet connection might be blocking you from connecting to some servers. In my case, I was trying to connect to mLab (a cloud database service that hosts MongoDB databases), and my ISP was blocking it.
I resolved this problem by:
Turning my Wi-Fi/Ethernet connection off and on again.
Typing npm update in the terminal to update npm.
Logging out of the session and logging in again.
After that I tried the same npm command and it worked. I didn't expect it to be that simple.
I am using CENTOS 7
I just figured this out, at least in my use case.
I was getting ECONNRESET. It turned out that the way my client was set up, it was hitting the server with an API call a ton of times really quickly -- and it only needed to hit the endpoint once.
When I fixed that, the error was gone.
I had the same issue and it appears that the Node.js version was the problem.
I installed the previous version of Node.js (10.14.2) using nvm (which lets you install several versions of Node.js and quickly switch between them), and everything was OK.
It is not a "clean" solution, but it can serve you temporarily.
Try adding these options to socket.io:
const options = { transports: ['websocket'], pingTimeout: 3000, pingInterval: 5000 };
I hope this helps!
Node.js sockets use non-blocking I/O. Consider using a non-blocking connection on the other end as well. For instance, if you use a blocking Java socket with Node, it will only work for a few seconds before the error appears. Mitigate this by implementing a non-blocking connection, i.e. a SocketChannel with a Selector.
When I first ran my app I got ECONNRESET, and after that an error like ECONNREFUSED. I faced both problems while running my Node app. In both cases the cause was that I had not started WampServer: my app uses a MySQL database that is served through WampServer. I resolved it by starting WampServer first and then running my Node app, and it works fine. You can use node or nodemon to run the application; that was not the problem in my case.
A few options I tried that worked as temporary solutions:
If using Node, try switching between Node versions with nvm use <version>. Worked for me.
Try switching internet connection
This is the extension of my last question here:
socket.io always has connection false
For now, I have two servers deployed under two different domain names. The first server works perfectly fine with socket.io, so I redeployed the server to the new domain name by simply pulling the same branch from GitHub, installing everything and running it. Then I found that all my socket.io connections failed, and the symptom is exactly the same as last time: connection always 'false' and disconnection always 'true'.
This time I am pretty sure it is not related to CORS, because I tried io.origins((origin, cb) => { if (whitelist.includes(origin)) { cb(null, true) } else { cb('failed', false) } }) and it shows the origin is allowed.
I also tried cors: { origin: '*' } and that also doesn't work.
Strangely, despite the fact that they are using the same code, connecting to the first domain name works perfectly fine. But the second one has the issue.
UPDATE:
I use this to track the error message.
this.socket.on('connect_error', function(err) {
console.log(`connect_error due to ${err.message}`);
});
And this is what is returned:
connect_error due to server error
Meanwhile, on the server side I can see nothing other than new connections being created and then disconnected due to ping timeout.
Where can I find more information to help me debug?
Turns out I forgot to run npm i after deploying to a new domain name. The version of socket.io wasn't updated and caused that issue.
Everyone who sees this, just make sure you have updated the package before you give up on debugging.
I'm using nodejs and a mongoDB - and I'm having some connection issues.
Well, actually "wake" issues! It connects perfectly well - is super fast and I'm generally happy with the results.
My problem: if I don't use the connection for a while (I say "a while" because the timeframe varies, 5+ minutes), it seems to stall. I don't get any disconnection events fired; it just hangs.
Eventually I get a response like Error: failed to connect to [ * .mongolab.com: * ] - ( * = masked values)
A quick restart of the app, and the connection's great again. Sometimes, if I don't restart the app, I can refresh and it reconnects happily.
This is why i think it is "wake" issues.
Rough outline of code:
I've not included the code - I don't think it's needed. It works (apart from the connection dropout)
Things to note: There is just the one "connect" - i never close it. I never reopen.
I'm using mongoose, socketio.
/* constants */
var mongoConnect = 'myworkingconnectionstring-includingDBname';
/* includes */
/* settings */
/* Schema */
var db = mongoose.connect(mongoConnect);
/* Socketio */
io.configure(function (){
  io.set('authorization', function (handshakeData, callback) {
  });
});
io.sockets.on('connection', function (socket) {
});//sockets
io.sockets.on('disconnect', function(socket) {
  console.log('socket disconnection')
});
/* The Routing */
app.post('/login', function(req, res){
});
app.get('/invited', function(req, res){
});
app.get('/', function(req, res){
});
app.get('/logout', function(req, res){
});
app.get('/error', function(req, res){
});
server.listen(port);
console.log('Listening on port '+port);
db.connection.on('error', function(err) {
  console.log("DB connection Error: "+err);
});
db.connection.on('open', function() {
  console.log("DB connected");
});
db.connection.on('close', function(str) {
  console.log("DB disconnected: "+str);
});
I have tried various configs here, like opening and closing the connection all the time. I believe, though, that the general consensus is to do as I am doing, with one open connection wrapping the lot?
I have tried a connection tester that keeps checking the status of the connection. Even though it appears to say everything's OK, the issue still happens.
I have had this issue from day one. I have always hosted the MongoDB with MongoLab.
The problem appears to be worse on localhost. But I still have the issue on Azure and now nodejit.su.
As it happens everywhere, it must be me, MongoDB, or MongoLab.
Incidentally, I have had a similar experience with the PHP driver too (to confirm, this question is about Node.js though).
It would be great for some help - even if someone just says "this is normal"
thanks in advance
Rob
UPDATE: Our support article for this topic (essentially a copy of this post) has moved to our connection troubleshooting doc.
There is a known issue that the Azure IaaS network enforces an idle timeout of roughly thirteen minutes (empirically arrived at). We are working with Azure to see if we can't make things more user-friendly, but in the meantime others have had success by configuring their driver options to work around the issue.
Max connection idle time
The most effective workaround we've found in working with Azure and our customers has been to set the max connection idle time below four minutes. The idea is to make the driver recycle idle connections before the firewall forces the issue. For example, one customer, who is using the C# driver, set MongoDefaults.MaxConnectionIdleTime to one minute and it cleared up their issues.
MongoDefaults.MaxConnectionIdleTime = TimeSpan.FromMinutes(1);
The application code itself didn't change, but now behind the scenes the driver aggressively recycles idle connections. The result can be seen in the server logs as well: lots of connection churn during idle periods in the app.
There are more details on this approach in the related mongo-user thread, SocketException using C# driver on azure.
Keepalive
You can also work around the issue by making your connections less idle with some kind of keepalive. This is a little tricky to implement unless your driver supports it out of the box, usually by taking advantage of TCP Keepalive. If you need to roll your own, make sure to grab each idle connection from the pool every couple minutes and issue some simple and cheap command, probably a ping.
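A roll-your-own keepalive along those lines might look like the minimal sketch below. `startKeepalive` and its `ping` argument are hypothetical names, not part of any driver: `ping` stands for whatever cheap command your driver offers. Note that a single ping through a pooled client only exercises whichever connection the pool hands out, so this keeps one connection warm rather than every idle one.

```javascript
// Hypothetical helper, not a driver API. `ping` is any function that issues
// a cheap command and may return a promise; `onError` is optional.
function startKeepalive(ping, intervalMs, onError) {
  function report(err) {
    if (onError) onError(err);
  }
  function tick() {
    try {
      var result = ping();
      // Promise-returning pings report their failures asynchronously.
      if (result && typeof result.catch === 'function') result.catch(report);
    } catch (err) {
      report(err);
    }
  }
  var timer = setInterval(tick, intervalMs);
  // Do not keep the process running just for the keepalive timer.
  if (typeof timer.unref === 'function') timer.unref();
  return {
    tick: tick,
    stop: function () { clearInterval(timer); }
  };
}
```

Pick an interval comfortably below the firewall's idle timeout, and call `stop()` when the app shuts down.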
Handling disconnects
Disconnects can happen from time to time even without an aggressive firewall setup. Before you get into production you want to be sure to handle them correctly.
First, be sure to enable auto reconnect. How to do so varies from driver to driver, but the effect is the same: when the driver detects that an operation failed because the connection was bad, it attempts to reconnect.
But this doesn't completely solve the problem. You still have the issue of what to do with the failed operation that triggered the reconnect. Auto reconnect doesn't automatically retry failed operations. That would be dangerous, especially for writes. So usually an exception is thrown and the app is asked to handle it. Often retrying reads is a no-brainer. But retrying writes should be carefully considered.
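For reads, the retry can be as small as the sketch below. `retryRead` is a hypothetical helper written for this illustration, assuming a Node-style callback API; it retries immediately with no backoff, and it should not be reused for writes without thinking through whether a repeated write is safe.

```javascript
// Hypothetical helper: retries a callback-style read up to `attempts` times.
// `operation` is a function that takes a Node-style callback (err, result).
function retryRead(operation, attempts, callback) {
  operation(function (err, result) {
    if (!err) return callback(null, result);
    // Out of attempts: surface the last error to the caller.
    if (attempts <= 1) return callback(err);
    retryRead(operation, attempts - 1, callback);
  });
}
```

A call would wrap the read in a function, for example `retryRead(function (cb) { collection.find({}).toArray(cb); }, 2, done)`, where `collection` and `done` are placeholders for your own collection handle and callback.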
The mongo shell session below demonstrates the issue. The mongo shell has auto reconnect enabled by default. I inserted a document into a collection named stuff, then found all the documents in that collection. I then set a timer for thirty minutes and tried the same find again. It failed, but the shell automatically reconnected, and when I immediately retried my find it worked as expected.
% mongo ds012345.mongolab.com:12345/mydatabase -u *** -p ***
MongoDB shell version: 2.2.2
connecting to: ds012345.mongolab.com:12345/mydatabase
> db.stuff.insert({})
> db.stuff.find()
{ "_id" : ObjectId("50f9b77c27b2e67041fd2245") }
> db.stuff.find()
Fri Jan 18 13:29:28 Socket recv() errno:60 Operation timed out 192.168.1.111:12345
Fri Jan 18 13:29:28 SocketException: remote: 192.168.1.111:12345 error: 9001 socket exception [1] server [192.168.1.111:12345]
Fri Jan 18 13:29:28 DBClientCursor::init call() failed
Fri Jan 18 13:29:28 query failed : mydatabase.stuff {} to: ds012345.mongolab.com:12345
Error: error doing query: failed
Fri Jan 18 13:29:28 trying reconnect to ds012345.mongolab.com:12345
Fri Jan 18 13:29:28 reconnect ds012345.mongolab.com:12345 ok
> db.stuff.find()
{ "_id" : ObjectId("50f9b77c27b2e67041fd2245") }
We're here to help
Of course, if you have any questions please feel free to contact us at support@mongolab.com. We're here to help.
Thanks for all the help guys - I have managed to solve this issue on both localhost and deployed to a live server.
Here is my now working connect code:
var MONGO = {
username: "username",
password: "pa55W0rd!",
server: '******.mongolab.com',
port: '*****',
db: 'dbname',
connectionString: function(){
return 'mongodb://'+this.username+':'+this.password+'@'+this.server+':'+this.port+'/'+this.db;
},
options: {
server:{
auto_reconnect: true,
socketOptions:{
connectTimeoutMS:3600000,
keepAlive:3600000,
socketTimeoutMS:3600000
}
}
}
};
var db = mongoose.createConnection(MONGO.connectionString(), MONGO.options);
db.on('error', function(err) {
console.log("DB connection Error: "+err);
});
db.on('open', function() {
console.log("DB connected");
});
db.on('close', function(str) {
console.log("DB disconnected: "+str);
});
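One caveat with building the connection string by concatenation: the credentials sit in front of an `@` separator, so a username or password containing characters such as `@`, `:` or `/` has to be percent-encoded first. A minimal sketch, where `buildMongoUri` and the config values are made up for illustration:

```javascript
// Hypothetical helper: builds a mongodb:// URI from a config object,
// percent-encoding the credentials so special characters survive parsing.
function buildMongoUri(cfg) {
  var auth = encodeURIComponent(cfg.username) + ':' + encodeURIComponent(cfg.password);
  return 'mongodb://' + auth + '@' + cfg.server + ':' + cfg.port + '/' + cfg.db;
}

// Example values only; none of these come from a real deployment.
var exampleUri = buildMongoUri({
  username: 'username',
  password: 'p@ss:word/1',
  server: 'example.mongolab.com',
  port: '12345',
  db: 'dbname'
});
```

A password like the one in the code above, with only letters, digits and `!`, passes through unchanged.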
I think the biggest change was to use "createConnection" over "connect" - I had used this before, but maybe the options help now. This article helped a lot http://journal.michaelahlers.org/2012/12/building-with-nodejs-persistence.html
If I'm honest, I'm not overly sure why I added those options. As mentioned by @jareed, I also found some people having success with "MaxConnectionIdleTime", but as far as I can see the JavaScript driver doesn't have this option: this was my attempt at replicating the behavior.
So far so good - hope this helps someone.
UPDATE: 18 April 2013 note, this is a second app with a different setup
Now, I thought I had this solved, but the problem reared its ugly head again on another app recently - with the same connection code. Confused!!!
However the set up was slightly different…
This new app was running on a windows box using IISNode. I didn't see this as significant initially.
I read there were possibly some issues with mongo on Azure (@jareed), so I moved the DB to AWS - still the problem persisted.
So I started playing about with that options object again, reading up quite a lot on it. I came to this conclusion:
options: {
server:{
auto_reconnect: true,
poolSize: 10,
socketOptions:{
keepAlive: 1
}
},
db: {
numberOfRetries: 10,
retryMiliSeconds: 1000
}
}
That was a bit more educated than the original options object I posted above.
However, it was still no good.
Now, for some reason I had to get off that Windows box (something to do with a module not compiling on it) - it was easier to move than to spend another week trying to get it to work.
So I moved my app to Nodejitsu. Lo and behold, my connection stayed alive! Woo!
So... what does this mean? I have no idea! What I do know is that those options seem to work on Nodejitsu... for me.
I believe IISNode uses some kind of "forever" script for keeping the app alive. Now to be fair, the app doesn't crash for this to kick in, but I think there must be some kind of "app cycle" that is refreshed constantly - this is how it can do continuous deployment (FTP code up, no need to restart the app) - maybe this is a factor; but I'm just guessing now.
Of course, all this means the issue isn't really solved. It's just solved for me in my setup.
A couple of recommendations for people still having this issue:
Make sure you are using the latest mongodb client for node.js. I noticed significant improvements in this area when migrating from v1.2.x to v1.3.10 (the latest as of today)
You can pass an options object to the MongoClient.connect. The following options worked for me when connecting from Azure to MongoLab:
options = {
db: {},
server: {
auto_reconnect: true,
socketOptions: {keepAlive: 1}
},
replSet: {},
mongos: {}
};
MongoClient.connect(dbUrl, options, function(err, dbConn) {
// your code
});
See this other answer in which I describe how to handle the 'close' event which seems to be more reliable. https://stackoverflow.com/a/20690008/446681
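As a rough illustration of watching those events (this is not the code from the linked answer), the sketch below tracks connection state from 'open', 'close' and 'error' events so the app can check it before issuing a query. `trackConnection` is a hypothetical helper, and a plain EventEmitter stands in for a real connection object here.

```javascript
var EventEmitter = require('events');

// Hypothetical helper: records connection state from the events a
// connection object emits. `log` is any function that accepts a string.
function trackConnection(conn, log) {
  var state = { connected: false, drops: 0 };
  conn.on('open', function () {
    state.connected = true;
    log('DB connected');
  });
  conn.on('close', function () {
    state.connected = false;
    state.drops += 1;
    log('DB disconnected');
  });
  conn.on('error', function (err) {
    log('DB connection error: ' + err);
  });
  return state;
}

// Stand-in emitter for demonstration; a real app would pass its connection.
var fakeConn = new EventEmitter();
var connState = trackConnection(fakeConn, console.log);
fakeConn.emit('open');
fakeConn.emit('close');
```

After the two emitted events, `connState.connected` is false and `connState.drops` is 1, which is the kind of signal an app could use to pause work or trigger its own reconnect logic.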
Enable the auto_reconnect Server option like this:
var db = mongoose.connect(mongoConnect, {server: {auto_reconnect: true}});
The connection you're opening here is actually a pool of 5 connections (by default) so you're right to just connect and leave it open. My guess is that you intermittently lose connectivity with mongolab and your connections die when that occurs. Hopefully, enabling auto_reconnect resolves that.
Increasing timeouts may help.
"socketTimeoutMS": How long a send or receive on a socket can take before timing out.
"wTimeoutMS": How many milliseconds the server waits for the write concern to be satisfied.
"connectTimeoutMS": How long a connection can take to be opened before timing out, in milliseconds.
$m = new MongoClient("mongodb://127.0.0.1:27017",
array("connect"=>TRUE, "connectTimeoutMS"=>10, "socketTimeoutMS"=>10,
"wTimeoutMS"=>10));
$db= $m->mydb;
$coll = $db->testData;
$coll->insert($paramArr);
I had a similar problem being disconnected from MongoDB periodically. Doing two things fixed it:
Make sure your computer never sleeps (that'll kill your network connection).
Bypass your router/firewall (or configure it properly, which I haven't figured out how to do yet).