When should a synchronous function be used in Node.js? - node.js

Node.js addons can be invoked in two ways: synchronously and asynchronously.
Since both are possible, I assume the synchronous way is more advantageous in some situations. What are those situations?

Pretty much the ONLY time I use the synchronous version of an IO function is in the startup phase of the server. It simplifies the startup code, but does not impact the ultimate scalability or performance of the server. For example, the built-in require() is synchronous for this reason.
At runtime, when processing a request, you pretty much never want to use a synchronous function if there is an asynchronous alternative, because it significantly reduces the scalability and performance of your server. Always using the async version can add some coding complexity, but that extra complexity is what gives node.js its performance and scalability.
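As a rough illustration of that split, here is a minimal sketch. The config file, its contents, and the handleRequest function are made up for the example; only the fs calls are real Node APIs.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical config file, written here only so the sketch is self-contained.
const configPath = path.join(os.tmpdir(), `sketch-config-${process.pid}.json`);
fs.writeFileSync(configPath, JSON.stringify({ greeting: 'hello' }));

// Startup phase: a blocking read is acceptable because no requests
// are being served yet.
const config = JSON.parse(fs.readFileSync(configPath, 'utf8'));

// Request phase: use the promise-based API so the event loop stays
// free to serve other requests while the read is in flight.
async function handleRequest() {
  const raw = await fs.promises.readFile(configPath, 'utf8');
  return `${config.greeting}: ${raw.length} bytes`;
}
```

The synchronous read runs once before the server accepts traffic; anything on the request path goes through the asynchronous call.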

Related

Node.js Express server: running res.render() / ejs.render() using Node.js threadpool

We have an app that uses server-side rendering for SEO purposes using EJS templating.
I am well-versed with Node.js and know that it's probably possible to tap into the Node.js threadpool for asynchronous I/O for whatever purpose you want, whether it's a good idea or a bad idea. Currently I am wondering if it is possible to run ejs.render() or res.render() with a thread in the threadpool instead of the main thread in Node.js?
We are doing a lot of heavy computational lifting in the render functions and we definitely want that off the main thread, otherwise we will be paying $$$ for more servers.
Is it just the rendering that is concerning you? There are other template engines which should produce better results; being that template rendering should be an idempotent operation, you could additionally distribute across a cluster.
V8 will compile your code to machine code and, if you're not hitting any deoptimizations or getting stalled by the garbage collector, I believe you should be in the neighborhood of your network I/O limits. I would definitely recommend that you first try other template engines, add a caching HTTP reverse proxy at the front, and run some benchmarks.
EJS is known to be synchronous, and that's not going to change, so basically it's an inefficient rendering engine for Node.js since it blocks the JS thread whenever it renders a view, which degrades your overall throughput, especially if your rendering is CPU heavy.
You should definitely think about some other options. E.g. https://github.com/ericf/express-handlebars
If you really have CPU-heavy computation in your webserver, then Node.js is definitely not the right tool for the job anyway. There are much better servers for handling multi-threading and parallel processing. You could just set up Node as a controller and forward your CPU-heavy requests to a backend service/server that can do the heavy lifting.
It would be helpful to see what kind of computation you are doing during render to provide a better answer.
Tapping into the thread pool (which is handled by libuv) would probably be a bad idea, but it is of course possible. You just need some C++ skills and the uv_queue_work() function of the libuv library to schedule work on a worker thread.
I have experimented with building a scripting engine that runs in a forked process (see Node's child_process module). I find that to be an attractive proposition for implementing rendering engines. Yes, there are issues of passing parameters (POST/GET query strings, session status, etc.), but they are easy to deal with, especially if you use fork (as opposed to exec or spawn). There are standard messaging methods for communicating between the child and the parent.
The only overhead is starting the additional instance of node (the rendering engine itself). If you are doing extensive computation in the scripting engine, then this constant, once-per-rendering-request overhead of forking a new process will be minor compared to the time taken to render.
If EJS rendering blocks the main node thread, then that alone is sufficient reason NOT to use it if you are doing any significant computation during rendering.
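To make the fork suggestion concrete, here is a minimal sketch. The child script, the renderInChild helper, and the trivial "template" are all hypothetical stand-ins; a real setup would run the actual template engine in the child. The child source is written to a temp file only so the example fits in one piece.

```javascript
const { fork } = require('node:child_process');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical child script: receives render data over the IPC channel
// and sends the rendered HTML back. String concatenation stands in for
// the CPU-heavy rendering work.
const childSource = [
  "process.on('message', (data) => {",
  "  process.send('<h1>Hello, ' + data.name + '</h1>');",
  "});",
].join('\n');

const childPath = path.join(os.tmpdir(), `render-child-${process.pid}.js`);
fs.writeFileSync(childPath, childSource);

// Forks one child per render call (the per-request overhead discussed
// above) and resolves with whatever the child sends back.
function renderInChild(data) {
  return new Promise((resolve, reject) => {
    const child = fork(childPath);
    child.once('message', (html) => {
      child.kill(); // the child has done its job
      resolve(html);
    });
    child.once('error', reject);
    child.send(data);
  });
}
```

The main thread only waits on a message here, so it stays free while the child renders. Whether one process per request is affordable depends on how heavy the rendering really is.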

Can Node.js become the bottleneck in a MEAN stack?

If I were writing an application using the MEAN stack, and the database is optimized well enough to almost never be the bottleneck, can Node.js itself become a bottleneck due to the site traffic and/or number of concurrent users? This is purely from the perspective of Node.js being an asynchronous, single-threaded event loop. One of the first tenets of Node.js development is to avoid writing code that performs CPU-intensive tasks.
Say, if I had to post-process the data returned from MongoDB and that was even moderately CPU intensive, it sounds like that should be handled by a service layer sitting in between Node.js and MongoDB, that is not pounding the same CPU dedicated to Node.js. Techniques such as process.nextTick() are harder to comprehend and more importantly to realize when to use them.
Forgive me for this borderline rant, but I really do want to have a better idea of Node.js' strengths and weaknesses.
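For what it's worth, the usual in-process technique for moderately CPU-intensive work is to split it into chunks and yield between them. The sketch below is an assumption about what that post-processing might look like (sumInChunks and the chunk size are made up). Note that it yields with setImmediate rather than process.nextTick, because nextTick callbacks run before the event loop moves on and so do not give pending I/O a turn.

```javascript
// Sums an array in chunks, yielding to the event loop between chunks
// so that pending I/O callbacks can run in between.
function sumInChunks(values, chunkSize = 1000) {
  return new Promise((resolve) => {
    let index = 0;
    let total = 0;

    function processChunk() {
      const end = Math.min(index + chunkSize, values.length);
      for (; index < end; index++) {
        total += values[index];
      }
      if (index < values.length) {
        setImmediate(processChunk); // let the event loop handle I/O first
      } else {
        resolve(total);
      }
    }

    processChunk();
  });
}
```

This keeps the server responsive but does not make the work any cheaper; it still competes for the same CPU, which is the argument for a separate service layer.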

Scala and Node.js

We chose Node.js for our web project, but there are many computational tasks for which we would prefer Scala. We are highly concerned about speed, what is the best way to call a Scala "worker" from Node.js in an asynchronous non-blocking way?
When queuing jobs, it's best to have some kind of broker, such as a message queue or a job queue. Redis is a popular choice, as it can also be used for caching and for storing data in memory. RabbitMQ is another common choice. The nice thing about a broker is that it can hold a job until a worker with available resources pulls it off the queue. A broker also acts as a load balancer in a sense: it holds the jobs while multiple worker nodes grab them, allowing for high availability, scalability, and parallel processing.
You probably should not be so concerned about speed; in my experience concerns like readability and maintainability are more important in almost all projects.
For short-lived "remote procedure calls" of at most a few seconds, I would tend to use Apache Thrift, which has libraries for Javascript and the JVM (Scrooge is an alternative Scala implementation, oriented towards writing async backends using Twitter's Finagle futures library), allowing nonblocking calls; by using Thrift you get strongly typed interface definitions that are engineered for forward compatibility, and you know exactly what changes you can make to the interface without breaking compatibility.
Alternatively one could use an ordinary HTTP ("REST") interface; node is oriented towards making async HTTP calls, and libraries like Spray make it easy to offer a high-performance, async HTTP interface in Scala.
For longer-running "batch" tasks where you're less concerned about latency and more about reliability, it's probably better to use a dedicated task queue as #tsutrzl suggests.
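A minimal sketch of the plain HTTP option from the Node side. The local server below is only a stand-in for the Scala worker so the example is self-contained; the /square path, the payload shape, and callWorker are assumptions, not part of any real service.

```javascript
const http = require('node:http');

// Stand-in for the Scala worker: squares the number it receives.
const worker = http.createServer((req, res) => {
  let body = '';
  req.on('data', (chunk) => { body += chunk; });
  req.on('end', () => {
    const { n } = JSON.parse(body);
    res.setHeader('Content-Type', 'application/json');
    res.end(JSON.stringify({ result: n * n }));
  });
});

// Non-blocking call to the worker; resolves with the parsed JSON reply.
function callWorker(port, payload) {
  return new Promise((resolve, reject) => {
    const req = http.request(
      {
        host: '127.0.0.1',
        port,
        path: '/square',
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        agent: false, // one-off connection, no pooling
      },
      (res) => {
        let body = '';
        res.on('data', (chunk) => { body += chunk; });
        res.on('end', () => resolve(JSON.parse(body)));
      }
    );
    req.on('error', reject);
    req.end(JSON.stringify(payload));
  });
}
```

To try it, start the stand-in with worker.listen(0, '127.0.0.1', ...) and pass worker.address().port to callWorker. The Node process keeps serving other requests while the call is in flight.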

Instances where Node.js operations need to be synchronous

JavaScript is single-threaded (besides web workers and spawning multiple processes), and it's best not to wait for long-running operations, as that blocks the thread. But still, when taking a peek into several modules on GitHub, I see that they actually use these synchronous operations, most of the time for file operations.
Am I staring at bad code/practice? Or is there some real need for synchronous operations in JavaScript that I am not aware of?
Can you post an example? You are most likely looking at:
a call that is actually asynchronous such as fs.read. Note that all synchronous calls in the node core API end with the word "Sync" like fs.readSync.
synchronous code such as require('somemodule') that runs before the application begins accepting network requests
And yes, if you are seeing code doing something like fs.readSync while responding to an HTTP request, that is bad code/practice and that application will lock up while that synchronous operation happens.
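A small sketch of what "lock up" means in practice. The busy-wait below is only a stand-in for a slow synchronous call such as fs.readSync on a slow disk; blockFor and measureTimerDelay are made-up names.

```javascript
// Busy-waits for roughly `ms` milliseconds, standing in for a slow
// synchronous operation.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // spin: nothing else can run on this thread meanwhile
  }
}

// Schedules a 0 ms timer, then blocks. Resolves with how long the
// timer actually had to wait before its callback ran.
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduled = Date.now();
    setTimeout(() => resolve(Date.now() - scheduled), 0);
    blockFor(blockMs);
  });
}
```

The timer asked for 0 ms, but it cannot fire until the blocking call returns. Every other pending request on the server is delayed in the same way.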
Node.js is not single-threaded: it uses a pool of threads, but they are exposed to the JavaScript layer as a single thread; otherwise it would be impossible to write asynchronous code. Any I/O call blocks the current thread.
Threads are used internally to fake the asynchronous nature of all the
system calls. libuv also uses threads to allow you, the application,
to perform a task asynchronously that is actually blocking, by
spawning a thread and collecting the result when it is done.
http://nikhilm.github.com/uvbook/threads.html
Node.js has decided to include synchronous functions to stay similar to other common languages, but they should never, never, never be used!
Node.js is asynchronous by nature; it's pure JavaScript. JavaScript is synonymous with closures and callbacks. If you want to write synchronous code, perhaps you should try another scripting language, like Python.
There's an excellent module called async that eases the pain of nested callbacks. So why should I use synchronous code? Silly. If I use synchronous code, I lose all the benefits that Node provides. The only exception is CLI apps, but even then I prefer to write all the code asynchronously. It's not really hard.
At some level, a synchronous operation has to occur. Node.js puts those in threads so that the whole server isn't blocked.

When to use asynchronous operations in asio

When should I use asynchronous operations in boost::asio instead of synchronous operations in separate threads?
Does the Rationale section help?
Most programs interact with the outside world in some way, whether it be via a file, a network, a serial cable, or the console. Sometimes, as is the case with networking, individual I/O operations can take a long time to complete. This poses particular challenges to application development.
Boost.Asio provides the tools to manage these long running operations, without requiring programs to use concurrency models based on threads and explicit locking.
I would strongly urge you to use an asynchronous approach whenever possible. An asynchronous call doesn't necessarily create a thread, so by sticking with asynchronous operations you may reduce the overhead associated with threads. In addition, threaded code is usually harder to develop and maintain.
Hope it helps.
