Treat async code as threads in Node.js? - node.js

Kind of a weird question. Imagine you have a situation where you need to run 10 synchronous functions. It doesn't matter in what order they complete; you just want to know when all 10 are done, i.e.:
f1()
f2()
f3()
...
f10()
doStuffWithResult();
Now, if you use promises like so, assuming you have rewritten each function to use promises:
Promise.all([f1,f2,f3,f4,f5,f6,f7,f8,f9,f10])
.then(() => {
    doStuffWithResult();
})
Would you see a performance increase? Theoretically, I want to say no because these functions are still synchronous, and everything is still running on one thread.
Thanks!

Would you see a performance increase?
No, what you are proposing would not be faster.
Promises do not create threads. All they do is provide a cooperative system for keeping track of when asynchronous operations are complete and then notifying interested parties of success or failure. They also provide services for propagating errors when asynchronous operations are nested.
And, your proposed Promise.all() code would not even work. Promise.all() expects an array of promises, not an array of function references. In your example, your functions would not even be called; the function references would simply be passed through as already-resolved values.
And, if you changed your code to something that would actually execute, then it would likely be slower than just calling the synchronous functions directly, because you'd be adding promise overhead and all .then() handlers run asynchronously, only after the currently executing synchronous code has finished.
In node.js, the only ways to execute synchronous things in parallel are to launch child processes (that can execute in parallel), to use worker threads (the worker_threads module in newer Node.js versions), or to pass the operations to some native code that can use actual OS threads.
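As a minimal sketch of the points above (f1, f2 and f3 here are hypothetical stand-ins for the synchronous functions in the question), this shows what Promise.all() does with function references versus with the results of calling the functions. Either way, the synchronous work runs one call after another on the same thread:

```javascript
// Hypothetical synchronous functions standing in for f1..f10 in the question.
const order = [];
function makeTask(name) {
  return function () {
    order.push(name); // record when the function body actually runs
    return name.toUpperCase();
  };
}
const f1 = makeTask('f1');
const f2 = makeTask('f2');
const f3 = makeTask('f3');

// Passing function references: nothing is called, and the functions
// themselves come back as the "results".
const notCalled = Promise.all([f1, f2, f3]);

// Calling each function and passing its return value: every call runs
// synchronously, in order, before Promise.all() even receives the array.
const called = Promise.all([f1(), f2(), f3()]);

// By this line, all three functions have already finished.
console.log(order); // [ 'f1', 'f2', 'f3' ]

called.then((results) => {
  console.log(results); // [ 'F1', 'F2', 'F3' ]
});
```

The promise version only adds bookkeeping on top of the same sequential calls, which is why it cannot be faster than calling the functions directly.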

Related

What is the difference in using async.parallel() and Promise.all()? Can they be used interchangeably?

Are there any differences in execution when I use async.parallel() (the NPM async module) and Promise.all()? Both of them say that they start the callbacks/promises in parallel without waiting for the previous one to finish. So, can I use them interchangeably?
As far as I can tell, they will both error as soon as a single task fails or a single promise is rejected. I would say the main differences are that:
Promise.all is native JavaScript, so no package is required.
async.parallel has a more flexible call signature in that you can pass an object of tasks and get back an object of results. Also, if you find yourself needing concurrency limits, you can easily switch over to async.parallelLimit.
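For comparison, the object-of-tasks shape can be approximated with Promise.all alone. This is only a rough sketch: the helper name allFromObject is hypothetical and is not part of the async package or of any standard API, and it does not attempt to reproduce concurrency limits.

```javascript
// Hypothetical helper: takes an object whose values are functions returning
// promises, starts them all, and resolves with an object of results under the
// same keys. It rejects as soon as any task rejects, like Promise.all().
async function allFromObject(tasks) {
  const keys = Object.keys(tasks);
  const values = await Promise.all(keys.map((key) => tasks[key]()));
  return Object.fromEntries(keys.map((key, i) => [key, values[i]]));
}

// Example tasks (made-up values, for illustration only).
const results = allFromObject({
  user: async () => 'alice',
  count: async () => 3,
});

results.then((r) => console.log(r)); // { user: 'alice', count: 3 }
```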

Is it a bad idea to use async/await in a Node/Express server?

I have a NodeJS/Express web app, where TypeOrm is used for many database functions. To avoid callback hell I usually use async/await to call and wait for database actions from my endpoint methods.
I've heard, however, that methods like fs.readFileSync should always be avoided, since they are blocking and will force all other requests to wait. Does this also apply to async/await? Do I have to use promises and callbacks to get decent multi-user performance?
Sync functions really BLOCK the event loop until the I/O is complete. On the other hand, async/await is just syntactic sugar for promise chaining. It is not actually synchronous, it just looks like it. That is the biggest difference you need to understand.
However, async/await does introduce one problem: it makes it easy to write promise code that is more sequential than it needs to be.
For example, two independent promises should be run concurrently with Promise.all, but when you use async/await you tend to write
await func1();
await func2();
So, logically, you may create a bottleneck yourself. But syntactically it has none of the blocking problems that the Sync functions have.
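To make the difference concrete, here is a small sketch comparing the two back-to-back awaits above with a Promise.all version. The delay helper is a hypothetical stand-in for any independent asynchronous operation (func1 and func2 in the snippet above), and the timings in the comments are approximate:

```javascript
// Hypothetical async operation: resolves with `value` after `ms` milliseconds.
function delay(ms, value) {
  return new Promise((resolve) => setTimeout(() => resolve(value), ms));
}

// Sequential: the second operation does not start until the first finishes.
async function sequential() {
  const a = await delay(100, 'a');
  const b = await delay(100, 'b');
  return [a, b]; // takes roughly 200 ms in total
}

// Concurrent: both operations are started first, then awaited together.
async function concurrent() {
  const [a, b] = await Promise.all([delay(100, 'a'), delay(100, 'b')]);
  return [a, b]; // takes roughly 100 ms in total
}

async function main() {
  let start = Date.now();
  await sequential();
  const sequentialMs = Date.now() - start;

  start = Date.now();
  await concurrent();
  const concurrentMs = Date.now() - start;

  console.log({ sequentialMs, concurrentMs });
  return { sequentialMs, concurrentMs };
}

const timings = main();
```

Neither version blocks the event loop; the sequential one just waits longer than it has to.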
You can check out the ES5 transpilation in the Babel REPL; it may help you understand this a little better.
The reason you shouldn't use *Sync functions is that they block the event loop. That causes your application to become unresponsive while synchronous functions are being executed.
Promises help you avoid the callback hell problem when you are dealing with async code. Additionally, async/await syntax allows you to write asynchronous code that LOOKS like synchronous code. So it's perfectly fine to use async and await.
I've heard, however, that methods like fs.readFileSync should always be avoided, since they are blocking and will force all other requests to wait.
This is true, mostly.
Remember that your server doesn't have to run on a single pipeline. You can use the cluster API to run multiple worker processes side by side (where the number of cores of the server's CPU limits the number of worker processes that should be run).
In that setup, even if a single event loop is blocked by synchronous I/O and other requests that are assigned to the very same loop are forced to wait, other worker processes are still able to process incoming requests.
On the other hand, if you make a request await an I/O operation, the loop is able to pick up another request and process it. But there's a trade-off.
Suppose the first request arrives and you await fs.readFile. A second request arrives and is processed. But the second request doesn't wait for any I/O; rather, it's a CPU-bound operation (a heavy calculation, maybe). The I/O operation triggered by the first request completes, but its continuation has to wait until the second request's computation finishes; only then can the event loop pick it up and send the response back.
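The scenario above can be reproduced in a few lines. This is a simplified sketch, not a real server: the first request's I/O is simulated with a 10 ms timer instead of an actual fs.readFile call, and busyWork is a hypothetical stand-in for the second request's heavy calculation.

```javascript
// Stand-in for the first request: an I/O operation that is ready after ~10 ms
// (simulated with a timer rather than a real fs.readFile call).
const start = Date.now();
const ioDone = new Promise((resolve) => {
  setTimeout(() => {
    resolve(Date.now() - start); // how long the continuation actually waited
  }, 10);
});

// Stand-in for the second request: CPU-bound work that holds the event loop
// for about `ms` milliseconds.
function busyWork(ms) {
  const end = Date.now() + ms;
  let iterations = 0;
  while (Date.now() < end) {
    iterations++;
  }
  return iterations;
}
busyWork(200);

ioDone.then((waitedMs) => {
  // The timer was due after 10 ms, but its callback could not run until the
  // busy loop returned, so the measured wait is roughly 200 ms.
  console.log(`I/O continuation ran after ${waitedMs} ms`);
});
```

From the first client's point of view, the response is delayed by the other request's computation even though its own I/O finished almost immediately.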
Do I have to use promises and callbacks to get decent multi-user performance?
A simple answer would be yes. However, be careful and monitor your app so that you don't fall into a pitfall (e.g. mixing I/O requests with CPU-intensive tasks, where the performance of async I/O could be worse from the client's perspective).
Using async/await syntax in Node.js is preferable to the alternatives, which are plain promises or, especially, callbacks. It allows for much cleaner code which is easier to understand and maintain. We used to have to use Babel to transpile in order to use these features, but they've been in Node for ages now, so I'd recommend that people use them.

Does Go have callback concept?

I found many talks saying that Node.js is bad because of callback hell and Go is good because of its synchronous model.
What I feel is that Go can also do callbacks the same way as Node.js, but in a synchronous way, since we can pass anonymous functions and use closures.
So why are people comparing Go and Node.js from a callback perspective, as if Go could never end up in callback hell?
Or do I misunderstand the meaning of callbacks and anonymous functions in Go?
A lot of things take time, e.g. waiting on a network socket, a file system read, a system call, etc. Therefore, a lot of languages, or more precisely their standard libraries, include asynchronous versions of their functions (often in addition to the synchronous versions), so that your program is able to do something else in the meantime.
In node.js things are even more extreme. They use a single-threaded event loop and therefore need to ensure that your program never blocks. They have a very well written standard library that is built around the concept of being asynchronous and they use callbacks in order to notify you when something is ready. The code basically looks like this:
doSomething1(arg1, arg2, function() {
    doSomething2(arg1, arg2, function() {
        doSomething3(function() {
            // done
        });
    });
});
somethingElse();
doSomething1 might take a long time to execute (because it needs to read from the network, for example), but your program can still execute somethingElse in the meantime. After doSomething1 has been executed, you want to call doSomething2 and doSomething3.
Go on the other hand is based around the concept of goroutines and channels (google for "Communicating Sequential Processes", if you want to learn more about the abstract concept). Goroutines are very cheap (you can have several thousands of them running at the same time) and therefore you can use them everywhere. The same code might look like this in Go:
go func() {
    doSomething1(arg1, arg2)
    doSomething2(arg1, arg2)
    doSomething3()
    // done
}()
somethingElse()
Whereas node.js focuses on providing only asynchronous APIs, Go usually encourages you to write only synchronous APIs (without callbacks or channels). The call to doSomething1 will block the current goroutine and doSomething2 will only be executed after doSomething1 has finished. But that's not a problem in Go, since there are usually other goroutines available that can be scheduled to run on the system thread. In this case, somethingElse is part of another goroutine and can be executed in the meantime, just like in the node.js example.
I personally prefer the Go code, since it's easier to read and reason about. Another advantage of Go is that it also works well with computation-heavy tasks. If you start a heavy computation in node.js that doesn't need to wait for network or filesystem calls, this computation basically blocks your event loop. Go's scheduler, on the other hand, will do its best to dispatch the goroutines onto a small number of system threads, and the OS might run those threads in parallel if your CPU supports it.
What I feel is that Golang can also do callbacks the same way as Node.js, but in a synchronous way, since we can pass anonymous functions and use closures.
So why are people comparing Golang and Node.js from a callback perspective, as if Golang could never end up in callback hell?
Yes, of course it is possible to mess things up in Go as well. The reason why you don't see as many callbacks as in node.js is that Go has channels for communication, which allow you to structure your code without using callbacks.
So, since there are channels, callbacks are not used as often, and therefore you are unlikely to stumble over callback-infested code. Of course, this doesn't mean that you cannot write scary code with channels as well...

Writing async javascript functions

I'm fairly familiar with nodejs now, but I have never tried to build a module before. I was curious about async functions.
If you are writing a function that just returns a value, is it worth it to make it async? For example, should this be written async?
exports.getFilename = function () {
    return filename;
}
Next, when writing an async function, is writing a function with a callback enough for performance, or is it recommended to thread it using a threading library as well?
Sorry for the somewhat obvious question; I normally am the one calling these functions.
Callbacks and asynchronousness are two separate things, though they are related, since in JavaScript callbacks are the underlying mechanism that allows you to manage control flow in asynchronous code.
Whether or not non-asynchronous functions should accept callbacks depends on what the function does. One type of function that is not asynchronous but can usefully accept a callback is the iteration function. Array.prototype.forEach() is a good example. JavaScript doesn't allow you to pass code blocks, so you pass functions to your iteration function.
Another example is filter functions that modify incoming data and return the modified version. Array.sort() is a good example. Passing a function to it allows you to apply your own conditions for how the array should be sorted.
Actually, filtering functions have a stronger reason for accepting functions/callbacks since it alters the behavior of the algorithm. Iteration functions are just nice syntactic sugar around for loops and are therefore a bit redundant. Though, they do make code nicer to read.
Whether or not a function should be asynchronous is a different matter. If it does something that takes a long time to complete (like I/O operations or large matrix calculations) then it should be made asynchronous. How long is "long" depends on your own tolerance. Generally, for a moderately busy website, a request shouldn't take more than 100ms to complete (in other words, you should be able to handle 10 hits per second at minimum). If an operation takes longer than that, then you should split it up and make it async; otherwise you risk making the site unresponsive to other users. For really busy websites you shouldn't tolerate operations that take longer than 10ms.
From the above explanation it should be obvious that just accepting a function or callback as an argument does not make a function asynchronous. The simplest pure-JS way to make something async is to use setTimeout to break up long calculations. Of course, the operation still happens in the same thread as the main Node process, but at least it doesn't block other requests for its whole duration. To utilize multi-core CPUs on your servers, you can use one of the threading libraries on NPM, or clusters, to make your function async.
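A rough sketch of the setTimeout approach, under some assumptions: sumInChunks and chunkSize are hypothetical names, the "long calculation" is just a sum, and the tick function stands in for other requests that need a turn on the event loop.

```javascript
// Sum the numbers 0..n-1 in chunks, yielding to the event loop between chunks
// so that other callbacks (e.g. other requests) get a chance to run.
function sumInChunks(n, chunkSize, callback) {
  let total = 0;
  let i = 0;

  function runChunk() {
    const end = Math.min(i + chunkSize, n);
    for (; i < end; i++) {
      total += i;
    }
    if (i < n) {
      setTimeout(runChunk, 0); // schedule the next chunk, then return
    } else {
      callback(total);
    }
  }

  runChunk();
}

// A second piece of work that should interleave with the chunks.
let ticksWhileSumming = 0;
let summing = true;
function tick() {
  if (!summing) return;
  ticksWhileSumming++;
  setTimeout(tick, 0);
}
setTimeout(tick, 0);

const done = new Promise((resolve) => {
  sumInChunks(1000000, 100000, (total) => {
    summing = false;
    resolve(total);
  });
});

done.then((total) => {
  console.log(total); // 499999500000
  console.log(ticksWhileSumming > 0); // true: other callbacks ran in between
});
```

Each chunk still blocks while it runs, so the chunk size decides how long other callbacks have to wait; the total amount of work is unchanged.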

Why is meteor.js synchronous?

Doesn't code take an efficiency hit by being synchronous? Why is coding synchronously a win?
I found these two links in doing some research: http://bjouhier.wordpress.com/2012/03/11/fibers-and-threads-in-node-js-what-for/, https://github.com/Sage/streamlinejs/
If the goal is to prevent spaghetti code, then clearly you can have asynchronous code, with streamline.js for example, that isn't a callback pyramid, right?
You have to distinguish two things here:
Synchronous functions like node's fs.readFileSync, fs.statSync, etc. All these functions have a Sync in their names (*). These functions are truly synchronous and blocking. If you call them, you block the event loop and you kill node's performance. You should only use these functions in your server's initialization script (or in command-line scripts).
Libraries and tools like fibers or streamline.js. These solutions allow you to write your code in sync-style but the code that you write with them will still execute asynchronously. They do not block the event loop.
(*) require is also blocking.
Meteor uses fibers. Its code is written in sync-style but it is non-blocking.
The win is not on the performance side (these solutions have their own overhead so they may be marginally slower but they can also do better than raw callbacks on specific code patterns like caching). The win, and the reason why these solutions have been developed, is on the usability side: they let you write your code in sync-style, even if you are calling asynchronous functions.
Jan 25 2017 edit: I created 3 gists to illustrate non-blocking fibers:
fibers-does-not-block.js, fibers-sleep-sequential.js, fibers-sleep-parallel.js
The code is not "synchronous" when using something like streamlinejs. The actual code will still run asynchronously. It's not very pretty to write lots of anonymous callback functions; that's where these tools help.
