Does Koa safely handle errors? - node.js

The Nodejs API states...
By the very nature of how throw works in JavaScript, there is almost
never any way to safely "pick up where you left off", without leaking
references, or creating some other sort of undefined brittle state.
However Koa traps errors and avoids exiting the nodejs process. What enables Koa to safely flout this advice?

There are primarily two cases where koa can not safely handle errors.
Thrown errors on different ticks:
app.use(function* () {
setImmediate(function () {
throw new Error('boom')
})
})
Emitter errors that are not set as response.body=:
app.use(function* () {
this.response.body = stream.pipe(zlib.createGzip())
})
Any function or library that does the first case is malformed and should not be used. If the function/library uses promises and/or callbacks correctly, it will never happen.
For emitters, simply always set each stream as the body (or use middleware):
app.use(function* () {
this.response.body = stream
this.response.body = this.response.body.pipe(zlib.createGzip())
})
Koa does this by allowing you to use try/catch on "async" stuff, specifically callbacks and promises. You can't, however, try/catch an error thrown on a different tick or an emitter.

Related

How to bubble up Exceptions from inside async/await functions in NodeJs?

Assume an Express route that makes a call to Mongoose and has to be async so it can await on the mongoose.find(). Also assume we are receiving XML but we have to change it to JSON, and that also needs to be async so I can call await inside of it.
If I do this:
app.post('/ams', async (req, res) => {
try {
xml2js.parseString(xml, async (err, json) => {
if (err) {
throw new XMLException();
}
// assume many more clauses here that can throw exceptions
res.status(200);
res.send("Data saved")
});
} catch(err) {
if (err instanceof XML2JSException) {
res.status(400);
message = "Malformed XML error: " + err;
res.send(message);
}
}
}
The server hangs forever. I'm assuming the async/await means that the server hits a timeout before something concludes.
If I put this:
res.status(200);
res.send("Data saved")
on the line before the catch(), then that is returned, but it is the only thing every returned. The client gets a 200, even if an XMLException is thrown.
I can see the XMLException throw in the console, but I cannot get a 400 to send back. I cannot get anything I that catch block to execute in a way that communicates the response to the client.
Is there a way to do this?
In a nutshell, there is no way to propagate an error from the xml2js.parseString() callback up to the higher code because that parent function has already exited and returned. This is how plain callbacks work with asynchronous code.
To understand the problem here, you have to follow the code flow for xml2js.parseString() in your function. If you instrumented it like this:
app.post('/ams', async (req, res) => {
try {
console.log("1");
xml2js.parseString(xml, async (err, json) => {
console.log("2");
if (err) {
throw new XMLException();
}
// assume many more clauses here that can throw exceptions
res.status(200);
res.send("Data saved")
});
console.log("3");
} catch (err) {
if (err instanceof XML2JSException) {
res.status(400);
message = "Malformed XML error: " + err;
res.send(message);
}
}
console.log("4");
});
Then, you would see this in the logs:
1 // about to call xml2js.parseString()
3 // after the call to xml2js.parseString()
4 // function about to exit
2 // callback called last after function returned
The outer function has finished and returned BEFORE your callback has been called. This is because xml2js.parseString() is asynchronous and non-blocking. That means that calling it just initiates the operation and then it immediately returns and the rest of your function continues to execute. It works in the background and some time later, it posts an event to the Javascript event queue and when the interpreter is done with whatever else it was doing, it will pick up that event and call the callback.
The callback will get called with an almost empty call stack. So, you can't use traditional try/catch exceptions with these plain, asynchronous callbacks. Instead, you must either handle the error inside the callback or call some function from within the callback to handle the error for you.
When you try to throw inside that plain, asynchronous callback, the exception just goes back into the event handler that triggered the completion of the asynchronous operation and no further because there's nothing else on the call stack. Your try/catch you show in your code cannot catch that exception. In fact, no other code can catch that exception - only code within the exception.
This is not a great way to write code, but nodejs survived with it for many years (by not using throw in these circumstances). However, this is why promises were invented and when used with the newer language features async/await, they provide a cleaner way to do things.
And, fortunately in this circumstance xml2js.parseString() has a promise interface already.
So, you can do this:
app.post('/ams', async (req, res) => {
try {
// get the xml data from somewhere
const json = await xml2js.parseString(xml);
// do something with json here
res.send("Data saved");
} catch (err) {
console.log(err);
res.status(400).send("Malformed XML error: " + err.message);
}
});
With the xml2js.parseString() interface, if you do NOT pass it a callback, it will return a promise instead that resolves to the final value or rejects with an error. This is not something all asynchronous interfaces can do, but is fairly common these days if the interface had the older style callback originally and then they want to now support promises. Newer interfaces are generally just built with only promise-based interfaces. Anyway, per the doc, this interface will return a promise if you don't pass a callback.
You can then use await with that promise that the function returns. If the promise resolves, the await will retrieve the resolved value of the promise. If the promise rejects, because you awaiting the rejection will be caught by the try/catch. FYI, you can also use .then() and .catch() with the promise, but in many cases, async and await are simpler so that's what I've shown here.
So, in this code, if there is invalid XML, then the promise that xml2js.parseString() returns will reject and control flow will go to the catch block where you can handle the error.
If you want to capture only the xml2js.parseString() error separately from other exceptions that could occur elsewhere in your code, you can put a try/catch around just it (though this code didn't show anything else that would likely throw an exception so I didn't add another try/catch). In fact, this form of try/catch can be used pretty much like you would normally use it with synchronous code. You can throw up to a higher level of try/catch too.
A few other notes, many people who first start programming with asynchronous operations try to just put await in front of anything asynchronous and hope that it solves their problem. await only does anything useful when you await a promise so your asynchronous function must return a promise that resolves/rejects when the asynchronous operation is complete for the await to do anything useful.
It is also possible to take a plain callback asynchronous function that does not have a promise interface and wrap a promise interface around it. You pretty much never want to mix promise interface functions with plain callback asynchronous operations because error handling and propagation is a nightmare with a mixed model. So, sometimes you have to "promisify" an older interface so you can use promises with it. In most cases, you can do that with util.promisify() built into the util library in nodejs. Fortunately, since promises and async/await are the modern and easier way to do asynchronous things, most newer asynchronous interfaces in the nodejs world come with promise interfaces already.
You are throwing exceptions inside the callback function. So you cant expect the catch block of the router to receive it.
One way to handle this is by using util.promisify.
try{
const util = require('util');
const parseString = util.promisify(xml2js.parseString);
let json = await parsestring(xml);
}catch(err)
{
...
}

Understanding difference in two async code snippets

I was trying to get my hands dirty on advanced NodeJS concepts by Stephen Grinder.
Trying to teach the mere basics of redis, Stephen did something like this
app.get('/api/blogs', requireLogin, async (req, res) => {
//This time we are setting
const redis = require('redis')
const redisURL = 'redis://127.0.0.1:6379';
const client = redis.createClient(redisURL);
const util = require('util')
client.get = util.promisify(client.get)
//We are checking if we have ever fetched any blogs related to the user with req.user.id
const cachedBlog = await client.get(req.user.id)
//if we have stored list of blogs, we will return those
if (cachedBlog) {
console.log(cachedBlog)
console.log("Serving from Cache")
return res.send(JSON.parse(cachedBlogs))
} //this is JSONIFIED as well so we need to convert it into list of arrays
console.log("serving from Mongoose")
//if no cache exsist
const blogs = await Blog.find({_user: req.user.id})
//blogs here is an object so we would need to stringfy it
res.send(blogs);
client.set(req.user.id, JSON.stringify(blogs))
})
And it works without any error but in last two lines, if we change the order
client.set(req.user.id, JSON.stringify(blogs))
res.send(blogs);
it does not display my blog.
Since inside the API, I am considering both of them to run asynchronously, I thought order won't matter.
Can anyone tell me what am I missing or unable to comprehend?
Since OP asks to understand the difference, not fix the code:
express runs the request handler function and catches synchronous errors (they become http500 errors). It doesn't do anything with the promise returned from the async function and doesn't use await internally, so you don't get error handling for async functions for free. All asynchronous errors need to be caught inside and passed to the next callback or handled in your code by sending an appropriate status code and error body.
When an error occurs, JS stops and doesn't execute any more lines in the function. So if an error is thrown from client.set placed before res.send, the line with send won't run and no response is sent. The browser should continue waiting for the response until timeout.
The other way around - you send response before the error, so you get the page, but the response doesn't end (I'd assume the connection remains open as if the backend was going to send more) but ever since early versions of Firefox browsers start rendering HTML as it's downloaded, so you see a page even though the browser is still waiting for the response to finish.
The order of these two lines doesn't matter but that res.send isn't called in case client.set goes first means that there's an error. If an error occurs in async function, this may result in UnhandledPromiseRejectionWarning warning that will be visible in console.
There are several problems with this snippet.
That the error occurs even when though client.set is asynchronous suggests that client.set causes synchronous error which wasn't caught.
client.set wasn't promisified but it should for correct control flow. That it wasn't provided with callback argument could be a reason why it caused an error.
As explained in this answer, Express doesn't support promises, all rejections should be explicitly handled for proper error handling.
All common code like require goes outside middleware function. It should be:
const redis = require('redis')
const redisURL = 'redis://127.0.0.1:6379';
const client = redis.createClient(redisURL);
const util = require('util')
client.get = util.promisify(client.get)
client.set = util.promisify(client.set)
app.get('/api/blogs', requireLogin, async (req, res, next) => {
try {
const cachedBlog = await client.get(req.user.id)
if (cachedBlog) {
return res.send(JSON.parse(cachedBlogs))
}
const blogs = await Blog.find({_user: req.user.id});
await client.set(req.user.id, JSON.stringify(blogs));
res.send(blogs);
} catch (err) {
next(err);
}
})
Most popular libraries have promise counterparts that allow to skip boilerplate promisification code, this applies to redis as well.
The two task will runs asynchronously but the order of execution matters.
client.set(req.user.id, JSON.stringify(blogs)) execution starts first, but as you are not using await, the promise will not be resolved but execution has already started.
After that res.send() will execute.
You are not getting the response implies that there is some error in the execution of client.set(req.user.id, JSON.stringify(blogs)).
Use Try catch block to trace this error (as mentioned in other answers).
You can also add these lines in your code to catch other "unhandledRejection" or "uncaughtException" error (if any).
process.on('unhandledRejection', (err) => {
logger.error('An unhandledRejection error occurred!');
logger.error(err.stack)
});
process.on('uncaughtException', function (err) {
logger.error('An uncaught error occurred!');
logger.error(err.stack);
});

Node.js Uncaught Exception - Passing more details

First of all, I know one way to catch uncaught errors is use process.on('uncaughtException', err, function() {});.
I want to know how I can pass more details, or more context to the error. I want to be able to get the variables used. I'm not trying to recover from the error, only get more details of my environment when the error happened before shutting down the process. Yeah, the stack trace is nice, but I'd like to know how to replicate the error.
For example, I have this function:
function doTheBatman(var1) {
var var2 = 'whatever';
// this causes an uncaught exception later in the code
}
On process.on, I want to be able to access var1 and var2.
process.on('uncaughtException', function(err) {
// process.whatever doesn't contain any active variables
});
A synchronous exception in an Express route handler will be caught by Express itself and the exception will be passed off the default Express error handler where you can catch it yourself and the exception context is passed to that default express error handler.
You can see this code inside of Express where a route handler gets called:
Layer.prototype.handle_request = function handle(req, res, next) {
var fn = this.handle;
if (fn.length > 3) {
// not a standard request handler
return next();
}
try {
fn(req, res, next);
} catch (err) {
next(err);
}
};
The call to next(err) will pass the exception object off to the default Express error handler (which you can install a handler for).
If your code is throwing an exception inside of an asynchronous callback, then that is more complicated to catch in action. If you're using regular async callbacks (not promises), then the only way I know of to catch those at a meaningful spot is to put a try/catch inside of every async callback so you can capture the local stack info.
If you use promises at the lowest level and only program your logic and asynchronous code flow use promise functionality, then an exception in a promise handler will automatically turn into a rejected promise and certain promise libraries (like the Bluebird library can be configured to give you a pretty full stack trace of where things went wrong). Promises have this advantage because every promise .then() or .catch() handler is automatically wrapped in a try/catch handler and those are turned into promise rejections which propagate up the chain to the first .catch() handler they find.
For the non-Express example you just added to your question, you will just have to put a try/catch somehwere in the local neighborhood to catch a local synchronous exception:
function doTheBatman(var1) {
try {
var var2 = 'whatever';
// this causes an uncaught exception later in the code
} catch(e) {
console.log(e);
}
}
You can even set a debugger breakpoint in the catch handler and then examine variables in that scope when it is hit. I don't know of any way to examine the local variables at the point of the exception from the uncaughtException handler. err.stack should give you the stack trace, but not variable values.
Express/frameworks may offer more elegant/robust solutions, but if you're truly after what your question asks for... why not just capture variables outside of the functions scope? This is typically nasty and not considered best practice, but if you have a function that you know could be susceptive to failing often, perhaps the quick and dirty solution is what you need. You could always refine this later, but hopefully this demonstrates the approach...
var transactionVars = {};
function clearTransaction() {
transactionVars = {};
}
function doTheBatman(var1) {
transactionVars['var2'] = 'whatever';
// [...] bunch of stuff, possibly blowing up...
clearTransaction(); // we made it this far? cool, reset...
}
process.on('uncaughtException', function(err) {
console.log(transactionVars['var2']); // whatever
});
Furthermore, if you want to really dirty (in case these two functions are in two files) you can always tack transactionVars on the global object.
This is essentially a poor mans event emitter pattern, which I would highly recommend refactoring into once you grasp the general flow of how this works...

Handling simple value rejects in a promise library

When the Bluebird module detects a non-Error reject, it reports Warning: a promise was rejected with a non-error.
So when we are writing a reusable promise library, how should we handle the situation when it is the client's code that throws simple values?
Simple my-library example:
var Promise = require('bluebird');
module.exports = function (cb) {
return new Promise(function (resolve, reject) {
try {
cb();
resolve('some data');
} catch (e) {
// should we do this?
reject(e);
// or should we do this?
// reject(e instanceof Error ? e : new Error(e));
}
});
};
A client that consumes it:
var lib = require('my-library');
lib(function () {
throw 'Regular text';
})
.catch(function (error) {
// - should we expect exactly 'Regular text' here?
// - or should it be wrapped into Error by the promise library?
});
In the example above we are going to see the same Warning: a promise was rejected with a non-error again, which in this case is obviously the client's fault, but the problem is - it will be reported against the promise library that did the reject.
This brings a question of how a reusable library should handle such cases.
Option 1: do nothing (just use reject(e)), let the client see the warning message Warning: a promise was rejected with a non-error.
Option 2: verify if the error thrown is an instance of Error and if not - do reject(new Error(e)). This enables the client to throw and reject with simple values, which is probably bad.
I am looking for a general recommendation here.
Also consider that sometimes the client can be an old library that you are trying to wrap into promises.
You should definitely and objectively wrap non-errors with errors.
Non errors are terrible for debugging, the bluebird warning is there because it is very hard to figure out where an error came from if it's a primitive. You can't attach properties to primitives and they have no stack traces.
Stack traces mean you don't have to guess where the errors come from - you know.
That said, bluebird already does this automatically with promisify and promisifyAll and consider using them. If you're doing this manually - just make sure you throw or reject things with stacks too.
Node in addition looks for the stack traces of your unhandled rejections - so rejecting with junk will not only give you useless information in Node but it might actually crash your program.
Error messages should be wrapped in error objects, but not automatically.
Automatic wrapping reduces the usefulness of stack traces. For example, lets say the client function is f and your library function is g:
function f(x) { throw 'Hello world' }
function g(f, x) { try { f(x); } catch (e) { throw new Error(e); } }
g(f, 1)
The client function and line numbers where the error originated will be completely missing from the stack:
Error: Hello world
at g (repl:1:52)
at repl:1:1
While if the client function creates the error object:
function f(x) { throw new Error('Hello world'); }
function g(f, x) { try{ f(x); } catch (e) { throw e; } }
g(f, 1)
we get a full stack trace:
Error: Hello world
at f (repl:1:23)
at g (repl:1:25)
at repl:1:1
There is no way to recover the stack up to the point in your library, making it useless for users (they wont be able to debug their own code).
That said, this largely depends on what sort of library you're making and what sort of contract you would like it to present to the users. You should think very carefully before using user-passed callbacks, as they invert the normal chain of responsibility in the stack.
Promises make this easier because they un-invert the inversion: the library is called first to generate a promise, then that promise is passed a callback. As such, creating a new error together with the creation of the promise allows Bluebird to at least partially look into / augment the stack of the client code (the line that created the promise will be present in the stack). The warning is still there because its impossible to provide a complete stack.
No, they simply want you to wrap your JSON error objects into something.
This is not freedom, it's dictature.
There's no point in wrapping simple JSON objects in bulky Error wrappers.
Don't listen to them.

Node.js: serial operations with branching

I'm new to node.js and using it for a backend that takes data from syslog messages and stores it to a database.
I've run into the following type of serial operations:
1. Query the database
2. If the query value is X do A. otherwise do B.
A. 1. Store "this" in DB
2. Query the database again
3. If the query value is Y do P. otherwise do Q.
P. Store "something"
Q. Store "something else"
B. 1. Store "that" in the DB
2. Store "the other thing" in the DB
The gist here is that I have some operations that need to happen in order, but there is branching logic in the order.
I end up in callback hell (I didn't know what that is when I came to Node.. I do now).
I have used the async library for things that are more straight forward - like doing things in order with async.forEachOfSeries or async.queue. But I don't think there's a way to use that if things have to happen in order but there is branching.
Is there a way to handle this that doesn't lead to callback hell?
Any sort of complicated logic like this is really, really going to benefit from using promises for every async operation, especially when you get to handling errors, but also just for structuring the logic flow.
Since you haven't provided any actual code, I will make up an example. Suppose you have two core async operations that both return a promise: query(...) and store(...).
Then, you could implement your above logic like this:
query(...).then(function(value) {
if (value === X) {
return store(value).then(function() {
return query(...).then(function(newValue) {
if (newValue === Y) {
return store("something");
} else {
return store("something else");
}
})
});
} else {
return store("that").then(function() {
return store("the other thing");
});
}
}).then(function() {
// everything succeeded here
}, function(err) {
// error occurred in anyone of the async operations
});
I won't pretend this is simple. Implemented logic flow among seven different async operation is just going to be a bit of code no matter how you do it. But, promises can make it less painful than otherwise and massively easier to make robust error handling work and to interface this with other async operations.
The main keys of promises used here are:
Make all your async operations returning promises that resolve with the value or reject with the failure as the reason.
Then, you can attach a .then() handler to any promise to see when the async operation is finished successfully or with error.
The first callback to .then() is the success handler, the second callback is the error handler.
If you return a promise from within a .then() handler, then that promise gets "chained" to the previous one so all success and error state becomes linked and gets returned back to the original promise state. That's why you see the above code always returning the nested promises. This automates the returning of errors all the way back to the caller and allows you to return a value all the way back to the caller, even from deeply nested promises.
If any promise in the above code rejects, then it will stop that chain of promises up until an actual reject handler is found and propagate that error back to that reject handler. Since the only reject handler in the above code is at the to level, then all errors can be caught and seen there, no matter how deep into the nested async mess it occurred (try doing that reliably with plain callbacks - it's quite difficult).
Native Promises
Node.js has promises built in now so you can use native promises for this. But, the promise specification that node.js implements is not particularly feature rich so many folks use a promise library (I use Bluebird for all my node.js development) to get additional features. One of the more prominent features is the ability to "promisify" an existing non-promise API. This allows you to take a function that works only with a callback and create a new function that works via a promise. I find this particularly useful since many APIs that have been around awhile do not natively return promises.
Promisifying a Non-Promise Interface
Suppose you had an async operation that used a traditional callback and you wanted to "promisify it". Here's an example of how you could do that manually. Suppose you had db.query(whatToSearchFor, callback). You can manually promisify it like this:
function queryAsync(whatToSearchFor) {
return new Promise(function(resolve, reject) {
db.query(whatToSearchFor, function(err, data) {
if (!err) {
reject(err);
} else {
resolve(data);
}
});
});
}
Then, you can just call queryAsync(whatToSearchFor) and use the returned promise.
queryAsync("foo").then(function(data) {
// data here
}, function(err) {
// error here
});
Or, if you use something like the Bluebird promise library, it has a single function for promisifying any async function that communicates its result via a node.js-style callback passed as the last argument:
var queryAsync = Promise.promisify(db.query, db);

Resources