I've been reading articles about methods/best practices for handling errors in Node. Besides listening to process.on('uncaughtException'), which is considered a bad practice, there seems to be no method to properly handle exceptions throughout the entire application.
All the suggested solutions, like using domains or try/catch blocks, have to be implemented per module (or worse, per action, if you use try/catch rather than domains).
Am I missing some article/documentation, or are domains and try/catch blocks the best available solutions?
The best solution is to rely on asynchronous code.
Asynchronous code should not throw; instead it should pass the error to the callback function as the first argument, to be handled by the caller. The problem with a lot of modules is that people throw whenever they think it is right to do so, and this happens far more often than the actual syntax, type, or other exceptions thrown by V8. If you don't want your app to crash in production, use well-written modules that are truly async and rely on callbacks rather than try/catch. If you really need to rely on a sync module, wrap it in try/catch; but there is almost always an async module out there that, even though it might not perform any I/O, passes the error to the callback.
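For instance, a minimal sketch of the error-first callback convention (readConfig and the file path are just illustrative):

    const fs = require('fs');

    // Error-first convention: never throw from async code; pass the
    // error to the callback as the first argument instead.
    function readConfig(path, callback) {
      fs.readFile(path, 'utf8', function (err, data) {
        if (err) return callback(err); // hand the error to the caller
        callback(null, data);
      });
    }

    readConfig('./config.json', function (err, data) {
      if (err) {
        console.error('could not read config:', err.message);
        return;
      }
      console.log('config loaded,', data.length, 'chars');
    });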
Moreover, try/catch does have a performance overhead (albeit a tiny one) that can put strain on resources if overused, so don't wrap every module in try/catch. From experience, I suggest that if you are developing something that goes into production, you choose your modules carefully rather than catching their errors later on. process.on('uncaughtException') is a last resort, usually ideal for logging the exceptions.
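A minimal sketch of that last-resort pattern (log, then exit so a process manager can restart the app in a clean state):

    // Last resort: log the exception, then exit; a process manager
    // (e.g. PM2, see below) restarts the app in a known-good state.
    process.on('uncaughtException', function (err) {
      console.error('uncaught exception:', err.stack);
      process.exit(1); // don't keep running in an unknown state
    });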
As an ultimate safety net, I suggest you use a process manager. My personal choice is PM2: it will attempt to gracefully restart your app and handles exceptions pretty well.
References:
- Try/Catch performance overhead
- Try/Catch performance overhead (recent)
- Using PM2 in production
- Isaac Schlueter on async code and throw
I have a simple node app with 1 function that defines 1000+ functions inside it (without running them).
When I call this function (the wrapper) around 200 times, the RSS memory of the process spikes from 100MB to 1000MB and then immediately goes down. (The memory spike only happens once, after around 200 calls; the calls before it don't cause a spike, and neither do the calls after.)
This issue is happening to us in our node server in production, and I was able to reproduce it in a simple node app here:
https://github.com/gileck/node-v8-memory-issue
When I use --jitless or --no-opt the issue does not happen (no spikes), but obviously we do not want to disable all the V8 optimizations in production.
This issue must be caused by some specific V8 optimization. I tried a few other V8 flags, but none of them fixes the issue (only --jitless and --no-opt do).
Does anyone know which v8 optimization could cause this?
Update:
We found that --no-concurrent-recompilation fixes this issue (no memory spikes at all), but we still can't explain it.
We are not sure why it happens or which code changes might fix it (without the flag).
As one of the answers suggests, moving all the 1000+ function definitions out of the main function would solve it, but then those functions would no longer be able to access the context of the main function, which is why they are defined inside it in the first place.
Imagine that you have a server and you want to handle a request.
Obviously, the request handler is going to run many times as the server receives lots of requests from clients.
Would you define functions inside the request handler (so they can access the request context) or define them outside the request handler and pass the request context as a parameter to all of them? We chose the first option... what do you think?
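To make the two options concrete, here's a stripped-down sketch (the handler and helper names are made up for illustration):

    // Option 1 (what we chose): helpers are re-created per request and
    // close over `req` directly.
    function handleRequestWithClosures(req, res) {
      function logUser() { console.log(req.userId); }
      function buildReply() { return JSON.stringify({ user: req.userId }); }
      logUser();
      res.end(buildReply());
    }

    // Option 2: helpers are defined once at module level, and the
    // request context is passed to each of them explicitly.
    function logUserFor(ctx) { console.log(ctx.userId); }
    function buildReplyFor(ctx) { return JSON.stringify({ user: ctx.userId }); }

    function handleRequestWithParams(req, res) {
      logUserFor(req);
      res.end(buildReplyFor(req));
    }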
Does anyone know which v8 optimization could cause this?
Load Elimination.
I guess it's fair to say that any optimization could cause lots of memory consumption in pathological cases (such as: a nearly 14 MB monster of a function as input, wow!), but Load Elimination is what causes it in this particular case.
You can see for yourself when you run with --turbo-stats (and optionally --turbo-filter=foo to zoom in on just that function).
You can disable Load Elimination if you feel that you must. A preferable approach would probably be to reorganize your code somewhat: defining 2,000 functions is totally fine, but the function that defines all these other functions probably doesn't need to run in a loop so often that it gets optimized, does it? You'll avoid not only this particular issue, but get better efficiency in general, if you define each function only once.
There may or may not be room for improving Load Elimination in Turbofan to be more efficient for huge inputs; that's a longer investigation and I'm not sure it's worth it (compared to working on other things that likely show up more frequently in practice).
I do want to emphasize for any future readers of this that disabling optimization(s) is not generally a good rule of thumb for improving performance (or anything else), on the contrary; nor are any other "secret" flags needed to unlock "secret" performance: the default configuration is very carefully optimized to give you what's (usually) best. It's a very rare special case that a particular optimization pass interacts badly with a particular code pattern in an input function.
I have a NodeJS/Express web app, where TypeOrm is used for many database functions. To avoid callback hell I usually use async/await to call and wait for database actions from my endpoint methods.
I've heard, however, that methods like fs.readFileSync should always be avoided, since they are blocking and will force all other requests to wait. Does this also apply to async/await? Do I have to use promises and callbacks to get decent multi-user performance?
Sync functions really do BLOCK the event loop until the I/O is complete. Async/await, on the other hand, is just syntactic sugar for Promise chaining: it is not actually synchronous, it just looks like it. That is the biggest difference you need to understand.
However, async/await introduces a problem of its own: it makes it easy to run promises too sequentially.
For example, two independent promises should be executed in parallel with Promise.all, but when you use async/await you tend to write:
    await func1();
    await func2();
So you may create a bottleneck yourself, logically. But syntactically it has none of the problems that the sync functions have.
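Assuming func1 and func2 are independent and both return promises, the fix looks like this:

    // Sequential: func2() doesn't even start until func1() resolves.
    await func1();
    await func2();

    // Parallel: both start immediately; Promise.all awaits them both.
    const [res1, res2] = await Promise.all([func1(), func2()]);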
You can check out the ES5 transpilation using the Babel REPL; it may help you understand this a little better.
The reason you shouldn't use *Sync functions is that they block the event loop. That causes your application to become irresponsive while synchronous functions are being executed.
Promises help you avoid the callback hell problem when you are dealing with async code. Additionally, async/await syntax allows you to write asynchronous code that LOOKS like synchronous code. So it's perfectly fine to use async and await.
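For example, a minimal sketch using fs.promises (available in Node since v10; the function is illustrative):

    const fs = require('fs').promises;

    // Reads LIKE synchronous code, but the event loop stays free
    // while the file I/O is in flight.
    async function loadTemplate(path) {
      const text = await fs.readFile(path, 'utf8');
      return text.toUpperCase();
    }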
I've heard, however, that methods like fs.readFileSync should always be avoided, since they are blocking and will force all other requests to wait.
This is true, mostly.
Remember that your server doesn't have to run on a single pipeline. Rather, you can use the cluster API to run multiple worker processes side by side (where the number of cores of the server's CPU limits the number of worker processes that should be run).
In reality then, even if a single event loop is blocked by a synchronous IO operation and other requests assigned to that same loop are forced to wait, the other worker processes are still able to process incoming requests.
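A minimal sketch of that setup (one worker per core; the port and response are illustrative):

    const cluster = require('cluster');
    const http = require('http');
    const os = require('os');

    if (cluster.isMaster) {
      // One worker per CPU core; each worker gets its own event loop.
      for (let i = 0; i < os.cpus().length; i++) {
        cluster.fork();
      }
      // Replace any worker that dies so capacity stays constant.
      cluster.on('exit', () => cluster.fork());
    } else {
      http.createServer((req, res) => {
        res.end(`handled by worker ${process.pid}\n`);
      }).listen(3000);
    }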
On the other hand, if you make a request await an IO operation, the loop is able to pick up another request and process it. But, there's a trade off.
Suppose the first request comes in and you await a fs.readFile. The second request comes in and is processed. But the second request doesn't wait for any IO; rather, it's a CPU-bound operation (a heavy calculation, maybe). The IO operation triggered by the first request completes, but its continuation has to wait until the second request finishes; only then can it be picked up by the event loop and the response sent back.
Do I have to use promises and callbacks to get decent multi-user performance?
A simple answer would be yes. However, be careful and monitor your app so you don't fall into a pitfall (e.g. mixing IO requests with CPU-intensive tasks, where the performance of async IO can end up worse from the client's perspective).
Using async/await syntax in Node.js is preferable to the alternatives, which are stock promises or, especially, callbacks. It allows for much cleaner code which is easier to understand and maintain. We used to have to transpile with Babel to get access to these features, but they have been in Node for ages now, so I'd recommend using them.
What is the best way to use Node.js's require function? By this, I'm referring to the placement of the require statements. Is it better to load all dependencies at the beginning of the script, to load them as you need them, or does it not make a notable difference whatsoever?
This article has a lot of useful information about how require works, though I still can't come to a definitive conclusion as to which method would be most efficient.
Assuming you're using node.js for some sort of server environment, several things are generally true about that server environment:
You want fast response time to any given request.
The code that runs for processing requests should not use synchronous I/O operations because that seriously lessens the scalability of the server.
Server startup time is generally not something you need to optimize for (within reason) so if you're going to pay an initialization cost somewhere, it is usually better paid once at server startup time.
So, given that require() uses synchronous I/O when the module has not yet been cached, that means you really don't generally want to be doing require() operations inside a request handler. And, you want fast response times for your request handlers so you don't want require() calls inside your handler anyway.
All of this leads to a general rule of thumb: load necessary modules at startup into module-level variables that you can reuse from one request to the next, and don't load modules inside your request handlers.
In addition to all of this, if you put all your require() statements in a block near the top of your module, it makes your module a lot more self-documenting about what other modules it depends on and how it initializes those modules. If require() statements are sprinkled all over the code, then it makes it a lot harder for a developer to see what this module is using without a lot more study of the code.
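A small sketch contrasting the two placements (the handler is illustrative):

    // Preferred: dependencies loaded once, at startup, into
    // module-level variables, and visible at a glance up top.
    const fs = require('fs');
    const path = require('path');

    function requestHandler(req, res) {
      // Anti-pattern: a require() here would do synchronous disk I/O
      // on the first uncached call, inside a live request, and it
      // hides the dependency from anyone reading the module.
      // const fs = require('fs');  // <-- avoid this
      res.end(path.extname(req.url));
    }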
It depends what performance characteristics you're looking for.
require() is not cheap; it has to read the JS file from disk, parse it, and execute any top-level code (and do all of that recursively for all files require()d by that file).
If you put all of your require()s on top, your code may take more time to start, but it won't suddenly slow down later. (note that moving the require() further down in the synchronous top-level code will make no difference beyond order of execution).
If you only require() other modules when first used asynchronously, your first request may be noticeably slower, as Node parses all of your dependencies. This also means that any errors from dependencies won't be caught until later. (note that all require() calls are cached, so the second request won't do any more work)
The other disadvantage to scattering require() calls throughout your code is that it makes it less readable; it's very nice to easily see exactly what each file depends on up top.
Why and when should we prefer readFileSync/writeFileSync to their asynchronous counterparts in Node.js, in particular for server applications?
Since they are blocking functions, you should always prefer the async versions in production environments.
Anyway, it can make sense to use them during the bootstrap of your application, for example if you load configuration from files and you don't want your fully initialized components to interact with partially initialized ones, or to have to design them to cope with that situation.
I cannot see any other meaningful use for them.
I prefer to use writeFileSync or readFileSync if it does not make any sense for the program to continue in the event of a failure and there is no other asynchronous work to do. Typically this is at the beginning/end of a program, but I have also used them while "checkpointing" a long running process to make it clear that nothing is changing while the checkpoint is being written to disk.
Of course you can implement this with the asynchronous versions too, it is just more verbose and prone to mistakes.
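A minimal sketch of that startup/config case (the file name is illustrative):

    const fs = require('fs');

    // At startup nothing else is waiting on the event loop, and the
    // app cannot do anything useful without its config anyway, so a
    // blocking read keeps the code simple and the failure path obvious.
    let config;
    try {
      config = JSON.parse(fs.readFileSync('./config.json', 'utf8'));
    } catch (err) {
      console.error('Cannot start without config:', err.message);
      process.exit(1);
    }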
Redis is very fast. For the most part, on my machine it is as fast as, say, native JavaScript statements or function calls in node.js. It is easy/painless to write regular JavaScript code in node.js because no callbacks are needed. I don't see why it shouldn't be that easy to get/set key/value data in Redis using node.js.
Assuming node.js and Redis are on the same machine, are there any npm libraries out there that allow interacting with Redis on node.js using blocking calls? I know this has to be a C/C++ library interfacing with V8.
I suppose you want to ensure all your Redis insert operations have been performed. To achieve that, you can use the MULTI command to insert keys or perform other operations as a batch.
The https://github.com/mranney/node_redis module queues up the commands pushed into the multi object and executes them accordingly.
That way you only need one callback, at the end of the exec call.
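A minimal sketch, assuming the classic callback API of node_redis (v3 and earlier; the keys are illustrative):

    const redis = require('redis');
    const client = redis.createClient();

    client.multi()
      .set('user:1:name', 'alice')
      .incr('user:count')
      .exec(function (err, replies) {
        // One callback, fired once ALL queued commands have executed.
        if (err) return console.error('MULTI failed:', err);
        console.log(replies); // e.g. [ 'OK', 1 ]
      });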
This seems like a common bear-trap for developers who are trying to get used to Node's evented programming model.
What happens is this: you run into a situation where the async/callback pattern isn't a good fit, you figure what you need is some way of doing blocking code, you ask Google/StackExchange about blocking in Node, and all you get is admonishment on how bad blocking is.
They're right - blocking, ("wait for the result of this before doing anything else"), isn't something you should try to do in Node. But what I think is more helpful is to realize that 99.9% of the time, you're not really looking for a way to do blocking, you're just looking for a way to make your app, "wait for the result of this before going on to do that," which is not exactly the same thing.
Try looking into the idea of "flow control" in Node rather than "blocking" for some design patterns that could be a clearer fit for what you're trying to do. Here's a list of libraries to check out:
https://github.com/joyent/node/wiki/modules#wiki-async-flow
I'm new to Node too, but I'm really digging Async: https://github.com/caolan/async
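For example, a small sketch with async.waterfall, where each step waits for the previous one without blocking the event loop (the db calls are hypothetical placeholders, stubbed here so the snippet runs):

    const async = require('async');

    // Hypothetical callback-style data layer, stubbed for illustration.
    const db = {
      getUser: (id, cb) => setImmediate(cb, null, { id: id }),
      getPosts: (userId, cb) => setImmediate(cb, null, [{ userId }])
    };

    // "Wait for the result of this before going on to that",
    // expressed as a waterfall instead of nested callbacks.
    async.waterfall([
      function (cb) { db.getUser(42, cb); },
      function (user, cb) { db.getPosts(user.id, cb); }
    ], function (err, posts) {
      if (err) return console.error(err);
      console.log('loaded', posts.length, 'posts');
    });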
Blocking code creates a MASSIVE bottleneck.
If you use blocking code your server will become INCREDIBLY slow.
Remember, node is single threaded. So any blocking code will block node for every connected client.
Your own benchmarking shows it's fast enough for one client. Have you benchmarked it with 1000 clients? If you try this, you will see why blocking code is bad.
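If you want to see it for yourself, here is a minimal sketch of a server that blocks its one thread (the port, path, and timing are arbitrary):

    const http = require('http');

    http.createServer(function (req, res) {
      if (req.url === '/block') {
        // Busy-wait for 5 seconds: Node's single thread can do
        // NOTHING else, for ANY client, until this loop ends.
        const end = Date.now() + 5000;
        while (Date.now() < end) { /* spin */ }
      }
      res.end('done\n');
    }).listen(3000);

    // Request /block in one terminal, then request / from another:
    // the second request stalls until the busy-wait finishes.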
Whilst Redis is quick, it is not instantaneous... this is why you must use a callback if you want to continue execution only once you are sure your values are there.
The only way I can think of to achieve this (and I am not suggesting you do) is to use a callback that sets a variable which serves as the predicate for exiting a polling loop or timer.