Writing async javascript functions - node.js

I'm fairly familiar with Node.js now, but I have never tried to build a module before. I was curious about async functions.
If you are writing a function that just returns a value, is it worth making it async? For example, should this be written as async?
exports.getFilename = function () {
    return filename;
};
Next, when writing an async function, is writing a function with a callback enough for performance, or is it recommended to thread it using a threading library as well?
Sorry for the somewhat obvious question, I normally am the one calling these functions.

Callbacks and asynchronicity are two separate things, though they are related, since in JavaScript callbacks are the basic mechanism for managing control flow in asynchronous code.
Whether or not non-asynchronous functions should accept callbacks depends on what the function does. One type of function that is not asynchronous but usefully takes a callback is an iteration function. Array.prototype.forEach() is a good example. JavaScript doesn't allow you to pass code blocks, so you pass functions to your iteration function.
Another example is filter functions that modify incoming data and return the modified version. Array.sort() is a good example. Passing a function to it allows you to apply your own conditions for how the array should be sorted.
Actually, filtering functions have a stronger reason for accepting functions/callbacks since it alters the behavior of the algorithm. Iteration functions are just nice syntactic sugar around for loops and are therefore a bit redundant. Though, they do make code nicer to read.
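To make the two cases above concrete, here is a small sketch (the sample array is made up). Both calls take a function argument, and both are fully synchronous:

```javascript
// Both calls below take a callback, yet both run synchronously:
// the callback has finished by the time the next statement executes.
const sizes = [10, 1, 5]; // hypothetical sample data

const seen = [];
sizes.forEach(function (size) {
  seen.push(size); // iteration callback: called once per element, in order
});

// Comparator callback: changes how sort() orders the elements.
// Without it, sort() compares elements as strings ("10" < "5").
const ascending = sizes.slice().sort(function (a, b) {
  return a - b;
});

console.log(seen);      // [ 10, 1, 5 ]
console.log(ascending); // [ 1, 5, 10 ]
```

The forEach callback could be replaced by a plain for loop, while the comparator actually changes what sort() produces.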
Whether or not a function should be asynchronous is a different matter. If it does something that takes a long time (like I/O operations or large matrix calculations) then it should be made asynchronous. How long is "long" depends on your own tolerance. Generally, for a moderately busy website a request shouldn't take more than 100ms to complete (in other words, you should be able to handle 10 hits per second at minimum). If an operation takes longer than that, you should split it up and make it async, otherwise you risk making the site unresponsive to other users. For really busy websites you shouldn't tolerate operations that take longer than 10ms.
From the above explanation it should be obvious that just accepting a function or callback as an argument does not make a function asynchronous. The simplest pure-js way to make something async is to use setTimeout to break long calculations. Of course, the operation still happens in the same thread as the main Node process but at least it doesn't block other requests. To utilize multi-core CPUs on your servers you can use one of the threading libraries on NPM or clusters to make your function async.
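A minimal sketch of that difference, using two made-up helper functions (syncWithCallback and asyncWithCallback are not from any library). The order in which the entries are recorded shows which callback ran before its caller continued:

```javascript
// Hypothetical helpers, not part of any library.

// Takes a callback but is still synchronous: the callback runs before
// syncWithCallback returns.
function syncWithCallback(value, callback) {
  callback(value * 2);
}

// Asynchronous: setTimeout queues the callback, so it runs only after
// the currently executing code has finished.
function asyncWithCallback(value, callback) {
  setTimeout(function () {
    callback(value * 2);
  }, 0);
}

const order = [];

syncWithCallback(1, function () { order.push('sync callback'); });
order.push('after sync call');

asyncWithCallback(1, function () {
  order.push('async callback');
  console.log(order);
  // [ 'sync callback', 'after sync call', 'after async call', 'async callback' ]
});
order.push('after async call');
```

Note that the deferred callback still runs on the main thread. setTimeout only changes when it runs, not where.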

Related

Is it a bad idea to use async/await in a Node/Express server?

I have a NodeJS/Express web app, where TypeOrm is used for many database functions. To avoid callback hell I usually use async/await to call and wait for database actions from my endpoint methods.
I've heard, however, that methods like fs.readFileSync should always be avoided, since they are blocking and will force all other requests to wait. Does this also apply to async/await? Do I have to use promises and callbacks to get decent multi-user performance?
Sync functions really do BLOCK the event loop until the I/O is complete. On the other hand, async/await is just syntactic sugar over promise chaining. It is not actually synchronous, it just looks like it. That is the biggest difference you need to understand.
However, async/await can introduce problems of its own by making promises too sequential.
For example, two independent promises should be run concurrently with Promise.all, but when you use async/await you tend to write:
await func1();
await func2();
So, logically, you may create a bottleneck yourself. But it does not block the event loop the way the Sync functions do.
You can look at the ES5 transpilation in the Babel REPL to understand this a little better.
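Here is a rough sketch of the sequential versus concurrent pattern. The func1 and func2 below are stand-ins built on a hypothetical delay helper, not real database calls:

```javascript
// Hypothetical stand-ins for two independent async operations.
function delay(ms, value) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(value); }, ms);
  });
}
const func1 = () => delay(100, 'one');
const func2 = () => delay(100, 'two');

// Sequential: func2 does not start until func1 has resolved (about 200 ms).
async function sequential() {
  const a = await func1();
  const b = await func2();
  return [a, b];
}

// Concurrent: both operations start immediately (about 100 ms in total).
async function concurrent() {
  return Promise.all([func1(), func2()]);
}

// Runs fn and reports its result together with the elapsed wall-clock time.
async function timed(fn) {
  const start = Date.now();
  const result = await fn();
  return { result, elapsed: Date.now() - start };
}

const runs = (async function () {
  const seq = await timed(sequential);
  const con = await timed(concurrent);
  console.log('sequential:', seq.elapsed, 'ms');
  console.log('concurrent:', con.elapsed, 'ms');
  return { seq, con };
})();
```

Neither version blocks the event loop. The sequential one just makes this particular request wait longer than it needs to.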
The reason you shouldn't use *Sync functions is that they block the event loop. That causes your application to become unresponsive while synchronous functions are being executed.
Promises help you avoid the callback hell problem when you are dealing with async code. Additionally, async/await syntax allows you to write asynchronous code that LOOKS like synchronous code. So it's perfectly fine to use async and await.
I've heard, however, that methods like fs.readFileSync should always be avoided, since they are blocking and will force all other requests to wait.
This is true, mostly.
Remember that your server doesn't run on a single pipeline. Rather, you use the cluster API to run multiple worker processes side-by-side (where the number of cores of the server's CPU limits the number of worker processes that should be run).
In reality then, even if a single event loop is blocked by a synchronous IO and other requests that are assigned to the very same loop are forced to wait, other worker processes are still able to process incoming requests.
On the other hand, if you make a request await an IO operation, the loop is able to pick up another request and process it. But, there's a trade off.
Suppose the first request comes in and you await fs.readFile. The second request comes in and is processed. But the second request doesn't wait for any IO; rather, it's a CPU-bound operation (a heavy calculation maybe?). The IO operation triggered by the first request completes, but its continuation has to wait until the second request finishes. Only then can the event loop pick up the continuation and send the response back.
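This scenario can be simulated in a few lines. It is only a sketch: the timer stands in for the first request's IO completing, and busyWork is a made-up busy-wait standing in for the CPU-bound second request:

```javascript
// Busy-wait to simulate a CPU-bound request handler (hypothetical workload).
function busyWork(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* burn CPU */ }
}

const start = Date.now();

// Stands in for the first request's I/O finishing after roughly 10 ms.
const ioDone = new Promise(function (resolve) {
  setTimeout(function () {
    resolve(Date.now() - start);
  }, 10);
});

// Stands in for the second, CPU-bound request: it holds the event loop
// for about 200 ms, so the callback above cannot run until it returns.
busyWork(200);

ioDone.then(function (elapsed) {
  // The continuation was ready after ~10 ms but runs only after ~200 ms.
  console.log('continuation ran after', elapsed, 'ms');
});
```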
Do I have to use promises and callbacks to get decent multi-user performance?
A simple answer would be yes. However, be careful and monitor your app so that you don't fall into a pitfall (e.g. mixing IO requests with CPU-intensive tasks, where the performance of async IO could be worse from the client's perspective).
Using async/await syntax in Node.js is preferable to the alternatives, which are stock promises or especially callbacks. It allows for much cleaner code which is easier to understand and maintain. We used to have to use Babel to transpile to get access to these, but they've been in Node for ages now, so I'd recommend that people use them.

Treat async code as threads in Node.js?

Kind of a weird question. Imagine you have a situation where you need to run 10 SYNCHRONOUS functions. It doesn't matter when they complete, you just want to know when all 10 are done, i.e.
f1()
f2()
f3()
...
f10()
doStuffWithResult();
Now, if you use promises like so, assuming you have rewritten each as a promise:
Promise.all([f1,f2,f3,f4,f5,f6,f7,f8,f9,f10])
    .then(() => {
        doStuffWithResult();
    })
Would you see a performance increase? Theoretically, I want to say no because these functions are still synchronous, and everything is still running on one thread.
Thanks!
Would you see a performance increase?
No, what you are proposing would not be faster.
Promises do not create threads. All they do is provide a cooperative system for keeping track of when asynchronous operations are complete and then notifying interested parties of success or failure. They also provide services for propagating errors when asynchronous operations are nested.
And, your proposed Promise.all() code would not even work. You must pass an array of promises to Promise.all(), not an array of function references. In your example, your functions would not even be called.
And, if you changed your code to something that would actually execute, then it would likely be slower than just calling the synchronous functions directly because you'd be executing promise code and all .then() handlers execute on a future tick (not synchronously).
In node.js, the only way to execute synchronous things in parallel is to launch child processes (that can execute in parallel), use worker threads (the worker_threads module in newer Node.js versions), or pass the operations to some native code that can use actual OS threads.
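A quick way to check the point about function references is the sketch below, with two made-up synchronous functions and a counter that records whether they were ever called:

```javascript
let calls = 0;
function f1() { calls += 1; return 1; } // hypothetical synchronous functions
function f2() { calls += 1; return 2; }

// Passing function references: Promise.all treats them as plain values,
// so it resolves with the functions themselves and never calls them.
const notCalled = Promise.all([f1, f2]).then(function (values) {
  console.log(typeof values[0], calls); // function 0
  return calls;
});

// Wrapping each call in a promise makes the functions run, but they still
// execute one after another on the same thread, just on a later tick.
const called = notCalled.then(function () {
  return Promise.all([f1, f2].map(function (fn) {
    return Promise.resolve().then(fn);
  }));
}).then(function (values) {
  console.log(values, calls); // [ 1, 2 ] 2
  return values;
});
```

The second form works, but it only adds promise overhead on top of the same synchronous work.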

Does Go have callback concept?

I found many talks saying that Node.js is bad because of callback hell and Go is good because of its synchronous model.
What I feel is that Go can also do callbacks the same way Node.js does, but in a synchronous way, since we can pass anonymous functions and use closures.
So why are they comparing Go and Node.js from a callback perspective, as if Go cannot end up in callback hell?
Or do I misunderstand the meaning of callbacks and anonymous functions in Go?
A lot of things take time, e.g. waiting on a network socket, a file system read, a system call, etc. Therefore, a lot of languages, or more precisely their standard libraries, include asynchronous versions of their functions (often in addition to the synchronous versions), so that your program is able to do something else in the meantime.
In node.js things are even more extreme. They use a single-threaded event loop and therefore need to ensure that your program never blocks. They have a very well written standard library that is built around the concept of being asynchronous and they use callbacks in order to notify you when something is ready. The code basically looks like this:
doSomething1(arg1, arg2, function() {
    doSomething2(arg1, arg2, function() {
        doSomething3(function() {
            // done
        });
    });
});
somethingElse();
doSomething1 might take a long time to execute (because it needs to read from the network for example), but your program can still execute somethingElse in the meantime. After doSomething1 has been executed, you want to call doSomething2 and doSomething3.
Go on the other hand is based around the concept of goroutines and channels (google for "Communicating Sequential Processes", if you want to learn more about the abstract concept). Goroutines are very cheap (you can have several thousands of them running at the same time) and therefore you can use them everywhere. The same code might look like this in Go:
go func() {
    doSomething1(arg1, arg2)
    doSomething2(arg1, arg2)
    doSomething3()
    // done
}()
somethingElse()
Whereas node.js focuses on providing only asynchronous APIs, Go usually encourages you to write only synchronous APIs (without callbacks or channels). The call to doSomething1 will block the current goroutine and doSomething2 will only be executed after doSomething1 has finished. But that's not a problem in Go, since there are usually other goroutines available that can be scheduled to run on the system thread. In this case, somethingElse is part of another goroutine and can be executed in the meantime, just like in the node.js example.
I personally prefer the Go code, since it's easier to read and reason about. Another advantage of Go is that it also works well with computation-heavy tasks. If you start a heavy computation in node.js that doesn't need to wait for network or filesystem calls, this computation basically blocks your event loop. Go's scheduler, on the other hand, will do its best to dispatch the goroutines onto a small number of system threads, and the OS might run those threads in parallel if your CPU supports it.
What I feel is that Golang can also do callbacks the same way Node.js does, but in a synchronous way, since we can pass anonymous functions and use closures.
So why are they comparing Golang and Node.js from a callback perspective, as if Golang cannot end up in callback hell?
Yes, of course it is possible to mess things up in Go as well. The reason why you don't see as many callbacks as in node.js is that Go has channels for communication, which allow you to structure your code without using callbacks.
So, since there are channels, callbacks are not used as often, and therefore you are unlikely to stumble over callback-infested code. Of course this doesn't mean that you cannot write scary code with channels as well...

Why is meteor.js synchronous?

Doesn't code take an efficiency hit by being synchronous? Why is coding synchronously a win?
I found these two links in doing some research: http://bjouhier.wordpress.com/2012/03/11/fibers-and-threads-in-node-js-what-for/, https://github.com/Sage/streamlinejs/
If the goal is to prevent spaghetti code, then clearly you can have asynchronous code, with streamline.js for example, that isn't a callback pyramid, right?
You have to distinguish two things here:
1. Synchronous functions like node's fs.readFileSync, fs.statSync, etc. All these functions have a Sync in their names (*). These functions are truly synchronous and blocking. If you call them, you block the event loop and you kill node's performance. You should only use these functions in your server's initialization script (or in command-line scripts).
2. Libraries and tools like fibers or streamline.js. These solutions allow you to write your code in sync-style but the code that you write with them will still execute asynchronously. They do not block the event loop.
(*) require is also blocking.
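For the first category, the usual split looks roughly like the sketch below. The temp file and the readForRequest helper are made up for illustration; fs.readFileSync and fs.readFile are the standard Node calls:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical config file created only for this demonstration.
const file = path.join(os.tmpdir(), 'sync-vs-async-demo.txt');
fs.writeFileSync(file, 'hello');

// Startup: blocking is acceptable because no requests are being served yet.
const atStartup = fs.readFileSync(file, 'utf8');

// Request handling: the callback form lets the event loop handle other
// work while the read is in progress.
function readForRequest(callback) {
  fs.readFile(file, 'utf8', callback);
}

readForRequest(function (err, contents) {
  if (err) throw err;
  console.log(atStartup === contents); // true
});
```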
Meteor uses fibers. Its code is written in sync-style but it is non-blocking.
The win is not on the performance side (these solutions have their own overhead so they may be marginally slower but they can also do better than raw callbacks on specific code patterns like caching). The win, and the reason why these solutions have been developed, is on the usability side: they let you write your code in sync-style, even if you are calling asynchronous functions.
Jan 25 2017 edit: I created 3 gists to illustrate non-blocking fibers:
fibers-does-not-block.js, fibers-sleep-sequential.js, fibers-sleep-parallel.js
The code is not "synchronous" when using something like streamlinejs. The actual code will still run asynchronously. It's not very pretty to write lots of anonymous callback functions, and that's where these tools help.

How do I make a non-IO operation synchronous vs. asynchronous in node.js?

I know the title sounds like a dupe of a dozen other questions, and it may well be. However, I've read those dozen questions, and Googled around for awhile, and found nothing that answers these questions to my satisfaction.
This might be because nobody has answered it properly, in which case you should vote me up.
This might be because I'm dumb and didn't understand the other answers (much more likely), in which case you should vote me down.
Context:
I know that IO operations in Node.js are detected and made to run asynchronously by default. My question is about non-IO operations that still might block/run for a long time.
Say I have a function blockingfunction with a for loop that does addition or whatnot (pure CPU cycles, no IO), and a lot of it. It takes a minute or more to run.
Say I want this function to run whenever someone makes a certain request to my server.
Question:
Obviously, if I explicitly invoke this loop at the outer level in my code, everything will block until it completes.
Most suggestions I've read suggest pushing it off into the future by starting all of my other handlers/servers etc. first, and deferring invocation of the function via process.nextTick or setTimeout(blockingfunction, 0).
But won't blockingfunction then just block on the next spin around the execution loop? I may be wrong, but it seems like doing that would start all of my other stuff without blocking the app, but then the first time someone made the request that results in blockingfunction being called, everything would block for as long as it took to complete.
Does putting blockingfunction inside a setTimeout or process.nextTick call somehow make it coexist with future operations without blocking them?
If not, is there a way to make blockingfunction do that without rewriting it?
How do others handle this problem? A lot of the answers I've seen are to the tune of "just trust your CPU-intensive things to be fast, they will be", but this doesn't satisfy.
Absent threading (where I can be guaranteed that the execution of blockingfunction will be interleaved with the execution of whatever else is going on), should I re-write CPU-intensive/time consuming loops to use process.nextTick to perform a fixed, guaranteed-fast number of iterations per tick?
Yes, you are correct. If you defer your function until the next tick, it will just block in that tick rather than the current one.
Unfortunately, there is no magic here that solves this for you. While it is possible to fire up that function in another process, it might not be worth the hassle, depending on what you're doing.
I recommend re-writing your function in such a way that work happens for a bit, and then continues on the next tick. Node ticks are very efficient... you could call them every iteration of a decent sized loop if needed, without a whole ton of overhead. Of course, you would have to profile it in your code to see what the impact is.
Yes, a blocking function will keep blocking even if you run it via process.nextTick.
Some options:
1. If it truly takes a while, then perhaps it should be spun out to a queue where you can have a dedicated worker process handle it.
1a. Node.js has a child-process flavor specifically for forking other node.js files with a built-in communication channel. So e.g. you can create one (or several) child process that handles these requests in order, then responds and hits the callback. See: http://nodejs.org/api/child_process.html#child_process_child_process_fork_modulepath_args_options
2. You can break up the blockingFunction into chunks that run in a loop. Have it yield every X iterations with process.nextTick to make way for other events to be handled. (In current Node.js versions use setImmediate for this, since process.nextTick callbacks run before pending I/O events are processed.)
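A minimal sketch of the chunking option. It uses setImmediate rather than process.nextTick, because in current Node.js versions nextTick callbacks run before pending I/O. The sumInChunks function and its parameters are made up for illustration:

```javascript
// Sums the integers below `total`, yielding to the event loop after every
// `chunkSize` iterations. Both parameter names are illustrative.
function sumInChunks(total, chunkSize, callback) {
  let i = 0;
  let sum = 0;

  function runChunk() {
    const stop = Math.min(i + chunkSize, total);
    for (; i < stop; i++) {
      sum += i;
    }
    if (i < total) {
      setImmediate(runChunk); // let other queued work run before the next chunk
    } else {
      callback(null, sum);
    }
  }

  runChunk();
}

// Other work queued on the event loop gets a turn between chunks.
let otherWorkRan = false;
setImmediate(function () { otherWorkRan = true; });

sumInChunks(1e6, 1e5, function (err, sum) {
  console.log(sum, otherWorkRan); // 499999500000 true
});
```

The chunk size is a tuning knob. Smaller chunks keep the server more responsive but add scheduling overhead, so it is worth profiling with your real workload.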

Resources