I am using Node.js with domains and cluster to catch unexpected exceptions (programmer bugs) and then restart gracefully. However, we occasionally have programmers who forget to add the final .catch().finally() that makes sure their code actually returns.
I can easily add a timeout that will throw an exception after a pre-specified time to make sure that these bad requests will not live forever. But what I'd like to do is to have the timeout pull information out of the domain to explain what had happened in the request so that we can have a log/notification/whatever that starts with a good indication of where the programmer bug was.
Is there any reasonable way to do that?
In case it matters, we are using express as a framework, express-domain-middleware to get domains/restart logic, and promises for async logic.
You don't actually need domains for this. If you use a promise library (like bluebird, when, or even Q) or a recent version of Node (namely, io.js), you don't have to remember to call .catch on every chain; you can use the dedicated event instead:
process.on("unhandledRejection", function(reason, promise){
// a promise was rejected and no `catch` handler was attached to it
// `reason` is the rejection value, `promise` is the rejected promise
// restart the process
});
Note that domains are deprecated and will likely be removed in a future version of Node.js. If you're using promises, you already have catch safety.
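For the timeout part of the question, one possible approach that does not rely on domains is to race each request's promise against a timer and put the request context into the timeout error. This is only a sketch: `withTimeout` and its `label` parameter are hypothetical names, not part of Express or any promise library.

```javascript
// Hypothetical helper: reject if `promise` has not settled within `ms`.
// `label` is free-form context (route, user id, ...) for the log entry.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise(function (resolve, reject) {
    timer = setTimeout(function () {
      reject(new Error('Timed out after ' + ms + ' ms: ' + label));
    }, ms);
  });
  // Whichever settles first wins; the timer is cleared either way.
  return Promise.race([promise, timeout]).finally(function () {
    clearTimeout(timer);
  });
}

// Usage: a promise that never settles stands in for a forgotten return.
withTimeout(new Promise(function () {}), 50, 'GET /example')
  .catch(function (err) { console.error(err.message); });
```

Because the label travels with the error, the log entry names the request that stalled instead of just reporting a generic timeout.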
My Scenario
I'm trying to utilize the global scope of Node.js to initialize a database connection once, and use the initialized connection when the lambda function is invoked.
This can save a lot of resources and time, as opening a DB connection is a lengthy process:
// Global scope: Runs only once
const redis = require('redis');
const client = redis.createClient({ <HOST>, <PORT> });
// Function scope: runs per invocation
exports.handler = (event, context, callback) => {
// do something with redis
};
My Problem
Some common connection errors may occur
Uninitialized connection: Since Node.js is asynchronous, the function may start executing code before redis.createClient has finished connecting, hence using an uninitialized connection.
Timeout: If the connection attempt times out for some reason, the function will have an erroneous handler.
Runtime error: If a connection error happens during code execution, following invocation will have an erroneous handler.
My Question
What's the proper way to overcome errors (initialization, timeout and runtime) of a global Redis connection used by an AWS Lambda function?
Lambda functions were designed to be stateless, so I don't know if there is one best answer to this. There's a really helpful GitHub comment about Lambda and RDS; it is about RDS rather than Redis, but it mostly applies. It mentions that the answer depends on how many requests the function will be making.
Regardless, this SO answer is more or less how I would do it; though I prefer a Promise-based API for the Redis library. The author handles the Uninitialized Connection issue by using callbacks to wait until the connection is opened before trying to use the connection. The other two issues you raise are also handled in that SO answer. Basically: if (err) callback(err).
I mean, given the GitHub comment message is from support at AWS, you need to make a connection inside the handler, so you may as well only do it there until you're sure you need the perf boost.
I realize this doesn't exactly answer the question, but the question has been open for a few days now and I'm curious. And there's nothing like being wrong on the internet to find out the right answer...
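One way to combine the global-scope reuse from the question with the "wait until connected" idea is to cache a connection promise and discard it when connecting fails. The sketch below is an assumption-laden illustration, not the linked answer's code: `createLazyConnection` is a hypothetical helper, and `connect` stands for any function that returns a promise for a ready client (for example, a wrapper around redis.createClient).

```javascript
// Hypothetical helper: `connect` returns a promise for a ready client.
function createLazyConnection(connect) {
  let pending = null; // cached promise, shared across invocations
  return function getClient() {
    if (!pending) {
      pending = connect().catch(function (err) {
        pending = null; // forget the failure so the next call reconnects
        throw err;
      });
    }
    return pending;
  };
}

// Demo with a fake connect that fails once, then succeeds.
let attempts = 0;
const getClient = createLazyConnection(function () {
  attempts += 1;
  return attempts === 1
    ? Promise.reject(new Error('connect timeout'))
    : Promise.resolve({ id: attempts });
});

// A handler would wait for the client before using it.
function handler(event, context, callback) {
  getClient().then(function (client) {
    callback(null, client.id);
  }, callback);
}
```

This covers the initialization and timeout cases. A runtime error on an established connection would additionally need the client's own error event to reset the cached promise.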
The only way I have found to "catch" EPIPE errors thrown asynchronously by a socket timing out or closing prematurely is to directly attach an event handler to the socket object itself, as demonstrated in the documentation here:
https://nodejs.org/api/errors.html
const net = require('net');
const connection = net.connect('localhost');
// Adding an 'error' event handler to a stream:
connection.on('error', (err) => {
// If the connection is reset by the server, or if it can't
// connect at all, or on any sort of error encountered by
// the connection, the error will be sent here.
console.error(err);
});
This works, but is in many cases unhelpful -- if you're accessing a database or another service that has a node driver, the request and socket objects are likely inaccessible from your app code.
The most obvious solution is "don't do things that generate these errors" but since any non-trivial application is dependent on other services, no amount of input-checking in advance can guarantee that the service on the other end won't hang up unexpectedly, throwing an EPIPE in your code and in all likelihood crashing Node.
So, the options for handling this situation seem to be:
Let the error crash your app and use nodemon or supervisor to automatically restart. This isn't clean, but it seems like the only way to really guarantee you'll get back up and running safely.
Write custom connection clients for dependent services. This lets you attach error handlers where known problems could occur. But it violates DRY and means that you're now on the hook for maintaining your own custom client code when otherwise reasonable open source solutions already exist. Basically, it adds a huge maintenance burden for a slightly cleaner solution to a fairly rare problem.
Am I missing something, or are those the best options available?
Using node.js, when I run the program
setTimeout(() => console.log("Timed out"), 0);
console.log("finishing");
I see
finishing
Timed out
But when I add a throw before "finishing"
setTimeout(() => console.log("Timed out"), 0);
throw new Error();
console.log("finishing");
I see
throw new Error();
^
Error
at Object.<anonymous> ...(stack trace here)...
And I don't see any mention of "Timed out".
Why is that? Even though the initial context would throw, once the stack was freed up, I expected the callback I passed to setTimeout would still run.
Does having an uncaught exception cause all timeouts to get canceled? Is this feature documented somewhere?
If I have multiple timeouts, is there a way for me to make sure that all the other timeouts continue to run when they can even if one of them happens to throw?
Unlike a web application running in a browser, a Node application runs as a process on top of the Google V8 JavaScript engine. If you look at https://nodejs.org/api/timers.html, it states that
The timer functions within Node.js implement a similar API as the timers API provided by Web Browsers but use a different internal implementation that is built around the Node.js Event Loop.
As the above statement explains, even though the same global functions are available in both cases, their implementations are different. Therefore when an uncaught exception occurs in a Node application, all code related to timeouts will stop as the process is terminated. The best way to handle this is to properly handle all exceptions. You can use the below code to capture all uncaught exceptions from the process level itself.
process.on('uncaughtException', function(error) {
console.log(error);
});
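For the last part of the question (keeping other timeouts alive when one of them throws), a possible sketch is to wrap each callback in try/catch. `safeTimeout` and `onError` are hypothetical names. Note that this only protects against errors thrown inside timer callbacks; a synchronous throw in the main script, as in the question, still ends the process unless an uncaughtException handler is installed.

```javascript
// Hypothetical helper: like setTimeout, but a throwing callback is reported
// to `onError` instead of becoming an uncaught exception.
function safeTimeout(fn, ms, onError) {
  return setTimeout(function () {
    try {
      fn();
    } catch (err) {
      onError(err);
    }
  }, ms);
}

const log = [];
// The first timer throws; its error is handed to the onError callback.
safeTimeout(function () { throw new Error('boom'); }, 0, function (err) {
  log.push('caught ' + err.message);
});
// The second timer still runs because the first error was contained.
safeTimeout(function () { log.push('second timer ran'); }, 0, function () {});
```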
I try my very best to ensure that there are no errors in my code, but occasionally there is an uncaught exception that comes along and kills my app.
I could do with it not killing the app, but instead output it to a file somewhere, and try to resume the app where it left off - or restart quietly and show a nice message to all users on the application that something has gone wrong and to give it a sec while it sorts itself out.
In the event of the app not running, it'd be good if it could redirect it to somewhere that says "The app isn't running, get in touch to let me know" or something like that.
I could use process.on('uncaughtException') ... - but is this the right thing to do?
Thank you very much for taking the time to read this, and I appreciate your help and thoughts on this matter.
You can't actually resume after a crash, not at least without code written specifically for that purpose, like defining state and everything.
Otherwise use clusters to restart the app.
// ... your code ...
var cluster = require('cluster');
process.on('uncaughtException', function(err){
//.. do with `err` as you please
cluster.fork(); // start another instance of the app
});
When it forks, how does it affect the users - do they experience any latency while it's switching?
Clusters are usually used to keep running more than a single copy of your node app at all times, so that while one of the workers respawns, others are still active and preventing any latency.
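if (cluster.isMaster) {
// fork one worker per CPU; wrap the call so forEach's arguments are not passed as `env`
require('os').cpus().forEach(function () { cluster.fork(); });
// replace any worker that exits
cluster.on('exit', function () { cluster.fork(); });
}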
Is there anything that I should look out for, e.g. say there was an error connecting to the database and I hadn't put in a handler to deal with that, so the app kept on crashing - would it just keep trying to fork and hog all the system resources?
I hadn't actually thought about that before now. It's a valid concern.
Usually the errors are triggered by users, so they aren't expected to cause a crash loop like that.
Unrecoverable errors, such as the database failing to connect, should probably be handled before the code gets as far as creating the forks.
mongoose.connection.on('open', function() {
// create forks here
});
mongoose.connection.on('error', function() {
// don't start the app if database isn't working..
});
Or maybe such errors should be identified and no forks created when they occur. But you'd probably have to know in advance which errors those could be in order to handle them.
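When the fatal errors can't be known in advance, another option for the fork-loop concern is to cap how often workers are restarted. The following is a minimal sketch, assuming a limit of a few restarts per time window is acceptable; `createRestartLimiter` is a hypothetical helper, not part of the cluster module.

```javascript
// Hypothetical helper: allow at most `maxRestarts` restarts per `windowMs`.
function createRestartLimiter(maxRestarts, windowMs) {
  let recent = []; // timestamps of restarts inside the current window
  return function shouldRestart(now) {
    recent = recent.filter(function (t) { return now - t < windowMs; });
    if (recent.length >= maxRestarts) {
      return false; // crash loop: stop forking and alert a human instead
    }
    recent.push(now);
    return true;
  };
}

const shouldRestart = createRestartLimiter(3, 60000);
// In the primary process this could guard the fork, for example:
// cluster.on('exit', function () {
//   if (shouldRestart(Date.now())) cluster.fork();
// });
```

Passing the current time in as an argument keeps the limiter easy to test without waiting for real time to pass.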
I am working with a partner on a project. He has written a lot of code in Node.js+Express, but we've been running into issues with the architecture.
To remedy this, my primary role has been to figure out the best way to architect a Node.js+Express application. I've run into two scenarios, dealing with errors, and I'd like some suggestions.
First, how do I capture top-level exceptions? The last thing I want is for a bug to completely kill the node process. I want to continue serving users in the face of any error.
Secondly, some errors are passed back via callbacks (we're using caolan / async). As part of each route handler, either we render a view (GET), redirect to another route (POST) and we want to redirect to an error screen with a custom error message. How can I make sure to capture this logic in one place?
First, how do I capture top-level exceptions? The last thing I want is for a bug to completely kill the node process. I want to continue serving users in the face of any error.
Edit: I think node's philosophy in general is that any uncaught exceptions should kill the process, and that you should run your node app under some kind of process monitor with appropriate logging facilities. The following advice is regarding any other errors you might encounter in your express route handlers etc.
Express has a general errorHandler, which should capture all thrown errors as well as everything passed as a parameter to next in your routes/middlewares, and respond with 500 Internal Server Error.
Secondly, some errors are passed back via callbacks (we're using caolan / async). As part of each route handler, either we render a view (GET), redirect to another route (POST) and we want to redirect to an error screen with a custom error message. How can I make sure to capture this logic in one place?
You could create a custom handleError, which you call in each callback like so:
async.series(..., function(err, results) {
if(err)
return handleError(req, res, err);
// ...
});
Or you could just pass the errors on with next(err) and implement your custom error handler as described here: http://expressjs.com/guide/error-handling.html
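A custom error handler of that kind could look roughly like the sketch below. Express identifies error-handling middleware by its four arguments. The `error` view name and the `publicMessage` property are illustrative assumptions, not Express conventions.

```javascript
// Error-handling middleware: Express recognises it by the four arguments.
function errorHandler(err, req, res, next) {
  console.error(err.stack || err);
  if (res.headersSent) {
    // A response is already in flight; let Express's default handler finish it.
    return next(err);
  }
  res.status(err.status || 500);
  // `publicMessage` is a hypothetical property holding the user-facing text.
  res.render('error', { message: err.publicMessage || 'Something went wrong' });
}

// Registration, after all routes (assuming `app` is the Express application):
// app.use(errorHandler);
```

With this in place, every route can call next(err) and the rendering of the error screen lives in one function.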
Top level exceptions:
You can use the uncaughtException event from process, but it's generally not recommended.
Often an application will be left in a corrupted state when an exception is thrown (e.g. some state that normally gets set was never set because of the exception). From then on, it will just cause more and more errors.
A recommended approach is to use something like forever to automatically restart the app in case it crashes. This way you will have the application in a sane state even after a crash.
Error handling in express:
You can create a new Error instance and pass it to the next callback in the chain.
Eg.
// `app` is the Express application, e.g. var app = express();
app.get('/some/url', function(req, res, next) {
//something here
if(error) {
return next(new Error('blah blah'));
}
});
To handle the error from there on, you can set an error handler. See the Express docs on error handling.
Check out the excellent log-handling module Winston: https://github.com/flatiron/winston
It allows you to configure exception handling in a manner that will not only log it, but will allow the process to continue. And, since these would obviously be serious issues, you can even configure Winston to send out emails on specific event types (like exceptions).