NodeJS in MultiCore System - multithreading

"Node.js is limited to a single thread". How will Node.js behave when it is deployed on a multi-core system? Will that boost performance?

The JavaScript running in the Node.js V8 engine is single-threaded, but the underlying libuv multi-platform support library is multi-threaded, and the operating system distributes those threads across the CPU cores according to its scheduling algorithm. So even with your JavaScript application running asynchronously (and single-threaded) at the top level, you still benefit from multiple cores under the covers.
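A minimal sketch of that effect, assuming the default libuv thread pool (4 threads unless UV_THREADPOOL_SIZE is changed). The asynchronous form of crypto.pbkdf2 is handed to the pool, so several hashes started together can overlap while the JavaScript thread stays free. The helper names, password, salt and iteration count are made up for illustration.

```javascript
const crypto = require('crypto');

// Hash one value asynchronously; resolves with the job id and elapsed milliseconds.
function hashOnce(id) {
  const started = Date.now();
  return new Promise((resolve, reject) => {
    // The async form of pbkdf2 runs on the libuv thread pool, not on the JS thread.
    crypto.pbkdf2('secret', 'salt', 20000, 64, 'sha512', (err) => {
      if (err) return reject(err);
      resolve({ id, ms: Date.now() - started });
    });
  });
}

// Start several hashes at once so the pool threads can work on them concurrently.
function hashInParallel(count) {
  const jobs = [];
  for (let i = 0; i < count; i++) jobs.push(hashOnce(i));
  return Promise.all(jobs);
}

hashInParallel(4).then((results) => {
  for (const r of results) console.log(`hash ${r.id} finished after ${r.ms} ms`);
});
```

On a multi-core machine the four jobs usually finish at about the same time rather than one after another, though the exact timings depend on the hardware.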
As others have mentioned, the Node.js Cluster module is an excellent way to exploit multi-core for concurrency at the application (JavaScript V8) level, and since workers forked by the cluster module share the listening socket, an Express app can have multiple worker processes executing concurrent server logic without needing a unique listening port for each process. Impressive.
As others have mentioned, you will need Redis or equivalent to share data among the cluster worker processes. You will also want a logging facility that is cluster aware, so the cluster master and all worker processes can log to a single shared log file. The Node log4node module is a good choice here, and it works with logrotate.
Typical web examples use the number of cores detected at runtime as the number of cluster worker processes to fork, but I prefer to make that a configuration option in a config.yaml file so I can tune the number of worker processes running the main JavaScript application as needed.
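One way to sketch that choice, assuming the configured value arrives as a string or number. The helper name resolveWorkerCount and the WORKERS environment variable are hypothetical; the environment variable stands in for a value read from config.yaml, since parsing YAML would need an extra library.

```javascript
const os = require('os');

// Hypothetical helper: use the configured worker count when it is a positive
// integer, otherwise fall back to the number of cores detected at runtime.
function resolveWorkerCount(configured, cpuCount = os.cpus().length) {
  const n = Number.parseInt(configured, 10);
  if (!Number.isInteger(n) || n < 1) return cpuCount;
  return n;
}

// Stand-in for a value loaded from config.yaml.
const config = { workers: process.env.WORKERS };
console.log(`would fork ${resolveWorkerCount(config.workers)} worker(s)`);
```

The result can then be used as the loop bound when calling cluster.fork().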

Node.js runs your JavaScript in one thread, but you can start multiple Node.js processes.
If you are, for example, building a web server, you can route every request to one of the Node.js processes.
Edit: As hereandnow78 and vkurchatkin suggested, the best way to use the power of a multi-core system is probably the Node.js cluster module.

The cluster module is the solution.
But you need to know that the Node.js cluster module works by spawning child processes, which means the processes do not share memory with each other.
To share data, you need to use Redis or another in-memory data grid (IMDG) that all the cluster worker processes can reach.

Related

NodeJS Clustering and Worker Threads

I am doing some research for a home project and I'm looking into the Cluster Module and Worker Threads.
I know the difference between Cluster and Worker Threads.
My question is:
In NodeJS is it possible to use Clustering and Worker Threads at the same time?
I'm guessing you mean the worker_threads module when you say "worker threads". I'm making a clear distinction because the Node.js runtime already comes with 4 threads by default in its libuv thread pool. The cluster module and worker_threads shouldn't have any problems working together: clustering gives you multiple independent Node.js instances, that is, multiple Node.js processes, each with its own threads as stated above. Spawning workers with the worker_threads module creates additional threads that aren't independent (they can and do share memory), which can be good if you're doing some number crunching, but they all run under a single Node.js process.
Processes can communicate using IPC, and Node worker threads can talk using the MessagePort class from the same module.
Therefore, yes, you can do that and my best guess is that, on the top of the "supervision tree" of sorts you spawn a couple of Node processes (using the cluster module) to distribute the load if you have them acting as servers (no clue about your use case), and then for each process you can use the worker_threads module to spawn additional threads if needed (to speed up some heavy processing etc).
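As a rough sketch of the thread half of that setup, the code below offloads a small computation to a worker thread and gets the result back over the thread's MessagePort. The helper name sumInThread and the inline thread source (passed with eval: true to keep everything in one file) are choices made for this illustration. Inside each process forked by the cluster module, a request handler could call the same helper.

```javascript
const { Worker } = require('worker_threads');

// Code that runs inside the thread, passed inline to keep the sketch in one file.
const threadSource = `
  const { parentPort, workerData } = require('worker_threads');
  let sum = 0;
  for (let i = 1; i <= workerData.upTo; i++) sum += i;
  parentPort.postMessage(sum); // send the result back to the spawning thread
`;

// Hypothetical helper: sum 1..upTo in a separate thread of the current process.
function sumInThread(upTo) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(threadSource, { eval: true, workerData: { upTo } });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

sumInThread(100).then((sum) => console.log('sum computed in a thread:', sum));
```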

How to fix NodeJS underutilizing CPU Cores?

According to this page, Go vs Node.js, Node.js does not appear to take full advantage of CPU cores when running CPU-intensive code.
If I use virtualization and simply add more Node.js instances, will I achieve the same performance as Go? I suppose there still will be overheads and one won't be able to achieve the same performance.
Multiple processes will do. For 4 cpus/threads you need 4 Node.js processes to make use of them. That requires a workload that can be split between processes though.
Node.js provides the Cluster module to distribute socket connections between multiple worker processes which may help in some workloads, but I doubt this would help any of the benchmark workloads.
You need to use the Cluster module of Node.js in order to take full advantage of CPU cores when running CPU-intensive code. It is built in, so nothing has to be installed.
Node.js is indeed a single threaded process. However, you can make use of clustering to spawn multiple workers on multiple cores.
If you are not using PM2 for spawning the app, please consider using it. PM2 makes spawning workers on multiple cores super easy.
pm2 start app.js -i max
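This command will spawn one worker per available core.
Also, note that if you are using sessions or sockets you might face some problems because of clustering: in-memory session data and open socket connections belong to a single worker process and are not visible to the others.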

How to use clusters in node js?

I am very new to Node.js and express. I am currently learning it by building my own services.
I recently read about clusters. I understood what clusters do. What I am not able to understand is how to make use of clusters in a production application.
One way I can think of is to use the Master process to just sit in front and route the incoming request to the next available child process in a round robin fashion. I am not sure if this is how it is designed to be used. I would like to know how should clusters be used in a typical web application.
Thanks.
The node.js cluster module is used any time you want to spread out the request processing across multiple node.js processes. This is most often used when you wish to increase your ability to handle more requests/second and you have multiple CPU cores in your server. By default, a single instance of node.js will not fully utilize multiple cores because the core JavaScript you run in your server is single threaded (uses one core). Node.js itself does use threads for some things internally, but that's still unlikely to fully utilize a multi-core system. Setting up a clustered node.js process for each CPU core will allow you to better maximize the available compute resources.
Clustering also provides you with some additional fault tolerance. If one worker process goes down, the other live workers can keep serving requests while the failed worker restarts.
The cluster module for node.js has a couple different scheduling algorithms - the round robin you mention is one. You can read more about that here: Cluster Round-Robin Load Balancing.
Because each cluster worker is a separate process, there is no automatic shared data among the different worker processes. As such, clustering is simplest to implement either where there is no shared data or where the shared data is already in a place that it can be accessed by multiple processes (such as in a database).
Keep in mind that a single node.js process (if written to properly use async I/O and not heavily compute bound) can serve many requests itself at once. Clustering is for when you want to expand scalability beyond what one instance can deliver.
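A minimal sketch of that arrangement, not a production setup. The function names startClustered and handleRequest are made up for illustration, and cluster.isPrimary assumes Node.js 16 or later (older versions call it cluster.isMaster). The primary forks one worker per core and replaces any worker that exits; every worker listens on the same port.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Hypothetical entry point: call startClustered(3000) from your main file.
function startClustered(port, workerCount = os.cpus().length) {
  if (cluster.isPrimary) {
    // Primary process: fork the workers and replace any worker that dies.
    for (let i = 0; i < workerCount; i++) cluster.fork();
    cluster.on('exit', (worker, code, signal) => {
      console.log(`worker ${worker.process.pid} exited (${signal || code}), restarting`);
      cluster.fork();
    });
  } else {
    // Worker process: all workers listen on the same port and the
    // incoming connections are distributed among them.
    http.createServer(handleRequest).listen(port);
  }
}

// Plain request handler; reports which process answered.
function handleRequest(req, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end(`handled by process ${process.pid}\n`);
}
```

Repeated requests to the port should then show different process ids in the responses.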
I have created a proof of concept for clustering in Node.js and added some details in the blog posts below. Please go through them; they may provide some clarity.
https://jksnu.blogspot.com/2022/02/cluster-in-node-js-application.html
https://jksnu.blogspot.com/2022/02/cluster-management-in-node-js.html

What are the effective differences between child_process.fork and cluster.fork?

I understand that cluster.fork will allow for multiple processes to listen on the same port(s), what I also want to know is how much additional overhead is there in supporting this when some of your workers are not listeners/handlers for the tcp service?
I have a service that I also want to launch a couple of workers.. ex: 2 web service listener processes, and 3 worker instances. Is it best to use cluster for them all, or would cluster for the 2 web services, and child_process for the workers be better?
I don't know the internals in node, but think it would be nice for myself and others to have a better understanding of which route to take given different needs. For now, I'm using cluster for all the processes.
cluster.fork is implemented on top of child_process.fork. The extra thing that cluster.fork brings is that it enables you to listen on a shared port. If you don't need that, just use child_process.fork. So yeah, use cluster for web servers and child_process for workers.
Cluster is a Node.js module that contains a set of functions and properties that help developers fork processes so they can take advantage of a multi-core system.
With the cluster module, creating child processes and sharing server ports among them becomes easy. A single Node.js instance runs in a single thread, so to take advantage of multi-core systems a cluster of Node.js processes is launched to distribute the load.
A developer can access operating system functionality through the child_process module by running any system command inside a child process. The child process's input streams can be controlled, and the developer can also listen to its output stream.
A child process can be easily spun up using Node's child_process module, and child processes started with fork can communicate with their parent through a built-in messaging system.
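As an illustration of running a command in a child process and listening to its output stream, here is a small sketch. The helper name runAndCapture is made up, and the current Node binary is used as the command only so that the example does not depend on a particular operating system.

```javascript
const { spawn } = require('child_process');

// Run a command in a child process and collect everything it writes to stdout.
function runAndCapture(command, args) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args);
    let output = '';
    child.stdout.setEncoding('utf8');
    child.stdout.on('data', (chunk) => { output += chunk; }); // listen to the output stream
    child.on('error', reject);                                // e.g. command not found
    child.on('close', (code) => resolve({ code, output }));   // streams closed, exit code known
  });
}

runAndCapture(process.execPath, ['-e', 'console.log("hello from child")'])
  .then(({ code, output }) => console.log(`exit ${code}: ${output.trim()}`));
```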

If nodejs is multithreaded why should i use cluster module to utilize multicore cpu?

If Node.js is multithreaded (see this article), and threads are managed by the OS, which can run them on the same core or on another core of a multi-core CPU (see this question), then Node.js will automatically utilize a multi-core CPU.
So why should I use cluster.fork to create separate Node.js processes to utilize multiple cores, as shown in this example in the Node docs?
I know that multiple processes have the advantage that when one process fails there is still another process to respond to requests, unlike with threads. I need to know whether multiple cores can be utilized just by spawning a process for each core, or whether it's an OS task that I can't control.
It depends.
Work that happens asynchronously and by Node itself, such as IO operations, is multithreaded. Your JavaScript application runs in a single thread.
In my opinion, the only time you need to fire off multiple processes is if the vast majority of your work is done in straight JavaScript. Node was designed around the fact that this is rarely the case, and is built for applications that primarily block on disk and network.
So, if you have a typical Node application where your JavaScript isn't the bulk of the work, then firing off multiple processes will not help you utilize multiple CPUs/cores.
However, if you have a special application where you do lots of work in your main loop, then multiple processes may be for you.
The easiest way to know is to monitor CPU utilization while your application runs. You will have to decide on a per-application basis what is best.
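Besides watching an external monitor, the process can report its own CPU time. The sketch below uses process.cpuUsage() around a piece of synchronous work; the helper name measureCpu and the busy loop are made up for illustration. When the CPU time is close to the wall-clock time, the JavaScript thread itself is likely the bottleneck, which is the case where multiple processes may help.

```javascript
// Measure how much CPU time the process spends while `work` runs.
function measureCpu(work) {
  const startCpu = process.cpuUsage();
  const startWall = process.hrtime.bigint();
  work();
  const cpu = process.cpuUsage(startCpu); // microseconds used since startCpu
  const wallMs = Number(process.hrtime.bigint() - startWall) / 1e6;
  const cpuMs = (cpu.user + cpu.system) / 1000;
  return { cpuMs, wallMs };
}

// A deliberately CPU-bound loop standing in for heavy work in the main loop.
const { cpuMs, wallMs } = measureCpu(() => {
  let x = 0;
  for (let i = 0; i < 5e7; i++) x += Math.sqrt(i);
  return x;
});
console.log(`cpu ${cpuMs.toFixed(1)} ms over ${wallMs.toFixed(1)} ms wall time`);
```

Note that process.cpuUsage() counts all threads of the process, so time spent in the libuv thread pool is included as well.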
Node is not multi-threaded from the developer's point of view. Threads are used in a very different way than they are used by, for example, Apache's worker MPM.
I believe this answer will clear things up.

Resources