How can I safely restart a Node application which receives a high volume of traffic? - node.js

For example, in the Python world you would use uWSGI or Gunicorn to restart your Python web app if it stopped running for any reason, e.g. memory leaks, unexpected runtime errors, etc. However, this is done in such a way that connections aren't dropped (so no 502s).
Looking at the options for Node it seems PM2 is a popular choice but I have two concerns:
Can it make the same guarantees regarding connection draining (no 502s, please)?
When I looked at PM2 before, it seemed to cause significant performance degradation in my application, where every millisecond of latency counts (it added hundreds of ms).
So my question is, where performance is a serious consideration and we can't drop connections while restarting, what are Node's uWSGI and Gunicorn equivalents?

Here are some strategies:
Use node.js clustering with N worker processes. You can then restart any single worker process and not affect overall availability.
Use a load balancer in front of multiple clusters. Then temporarily configure the load balancer to send traffic to only one cluster. When the cluster taken out of rotation has finished with all of its open connections, you can restart all the processes in that cluster.
For even more flexibility, use multiple clusters on separate machines. That allows you to even take a server machine down for hardware maintenance without disrupting overall availability.
If the clustered processes share resources such as databases, then you will also need redundancy for those resources in order to restart them without interruption.
Of course, you have to make sure that taking part of your system out of service for a reboot or maintenance still leaves you with enough capacity, so you would typically do this when overall load is low (e.g. 4am for your largest user base).
PM2 is one such tool that allows you to do portions of what is recommended here (such as clustering and seamlessly restarting part of a cluster). There are other tools.
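As a rough illustration of the first strategy, here is a minimal sketch of a rolling restart driven from the primary process. It relies on the cluster module's fork(), disconnect(), and kill() calls and on the 'listening' and 'exit' worker events. The function name rollingRestart, the injected cluster parameter, and the 30-second timeout are assumptions made up for this example, not something Node or PM2 prescribes.

```javascript
// Sketch: restart cluster workers one at a time from the primary process.
// `cluster` is passed in (normally require('node:cluster')) so the logic can
// be tried against a stand-in object; timeoutMs is an assumed safety limit.
async function rollingRestart(cluster, timeoutMs = 30000) {
  // Snapshot the current workers; replacements forked below are not revisited.
  const oldWorkers = Object.values(cluster.workers);

  for (const oldWorker of oldWorkers) {
    // Bring up the replacement first and wait until it accepts connections.
    const replacement = cluster.fork();
    await new Promise((resolve) => replacement.once('listening', resolve));

    // disconnect() stops the old worker from accepting new connections and
    // lets established ones finish before the worker exits.
    const exited = new Promise((resolve) => oldWorker.once('exit', resolve));

    // Long-lived keep-alive connections can hold a worker open, so force the
    // exit after timeoutMs rather than waiting forever.
    const timer = setTimeout(() => oldWorker.kill(), timeoutMs);

    oldWorker.disconnect();
    await exited;
    clearTimeout(timer);
  }
}
```

In a real primary you might call rollingRestart(require('node:cluster')) from a signal handler such as SIGHUP. Whether in-flight requests survive still depends on the workers finishing their open connections cleanly, so treat this as a starting point rather than a guarantee.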


pm2 safe way to use max_memory_restart

I'm building a Node.js + Express web application using pm2 cluster mode as a load balancer. This turned out to be a big performance improvement, as my application now spawns an instance of itself for each one of my CPU cores.
To take full advantage of it, I'm using a custom start script in which I added pm2's max_memory_restart option, so if one of the instances exceeds 400mb of memory usage it restarts itself. Seeing that behavior in action, I couldn't help questioning whether it is safe to use this option. Although it's nice to have an auto-restart kick in when memory grows past a certain point, I thought of two possible downsides:
If one of my endpoints is memory intensive, the instance could restart itself in the middle of processing, giving the user an error.
If my server has, let's say, 2GB of RAM and 8 CPU cores, then the max_memory_restart option should be at most 256mb if I'm running pm2 in cluster mode, as it applies to each instance. Isn't there a risk in giving a fairly low max_memory_restart value here? In theory, the instances would restart frequently in this case.
Given these scenarios, is it safe/adequate to use pm2's max_memory_restart option?
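To make the arithmetic in the second point concrete, here is a small sketch of a PM2 ecosystem file for that 2GB, 8-core machine. The app name, the script path, and the perInstanceLimitMb helper (including its reservedMb headroom parameter) are placeholders invented for illustration; exec_mode, instances, and max_memory_restart are the PM2 options under discussion.

```javascript
// ecosystem.config.js (sketch; app name and script path are placeholders)

// Split the machine's RAM evenly across instances. reservedMb is an assumed
// amount of headroom for the OS and PM2 itself, not a PM2 setting.
function perInstanceLimitMb(totalMb, instances, reservedMb = 0) {
  return Math.floor((totalMb - reservedMb) / instances);
}

const instances = 8; // one instance per CPU core
const limitMb = perInstanceLimitMb(2048, instances); // 2048 / 8 = 256

const config = {
  apps: [
    {
      name: 'web', // placeholder
      script: './server.js', // placeholder
      exec_mode: 'cluster',
      instances,
      // Applies to each instance separately, hence the division above.
      max_memory_restart: `${limitMb}M`,
    },
  ],
};

module.exports = config;
```

Reserving some headroom, for example perInstanceLimitMb(2048, 8, 512), lowers the per-instance limit further to 192, which is exactly the tension the question describes.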

Any benefits when clustering node.js application server in single core computer

This question is about a single core only, so multi-core setups are out of scope.
I am running a Node.js application as an HTTP server on a single-core computer using Express.js. Assuming that my server is able to handle 1000 concurrent requests, would clustering bring any improvement in response speed?
Does process context switching have much impact on performance in this case?
I wouldn't expect improvements in speed, but you might get some other benefits.
For example, if one process crashes, the other Node.js instance could still work.
would clustering bring any improvement in response speed?
Keep in mind that in clustered mode your application runs behind a load balancer, which itself takes some CPU and memory to manage and forward the network traffic. Only what's left of the resources is then available to the processes handling the distributed load.
Apart from a few rare and easily avoidable cases such as the one @Cristyan mentions (in which case your load balancer can be an orchestrator such as Kubernetes that manages most of this), running a Node.js app in a cluster on a single CPU core does not make sense to me. If a process has to wait for something, it has to wait for it! Asynchronously it can work on other requests in the meantime, but even then the other processes would want their share of the CPU too.

How to use clusters in node js?

I am very new to Node.js and express. I am currently learning it by building my own services.
I recently read about clusters. I understood what clusters do. What I am not able to understand is how to make use of clusters in a production application.
One way I can think of is to have the master process just sit in front and route each incoming request to the next available child process in a round-robin fashion. I am not sure if this is how it is designed to be used. I would like to know how clusters should be used in a typical web application.
The node.js cluster module is used any time you want to spread request processing across multiple node.js processes. This is most often done when you wish to handle more requests/second and you have multiple CPU cores in your server. By default, a single instance of node.js will not fully utilize multiple cores because the JavaScript you run in your server is single threaded (uses one core). Node.js itself does use threads for some things internally, but that's still unlikely to fully utilize a multi-core system. Setting up one clustered node.js process for each CPU core will allow you to make better use of the available compute resources.
Clustering also provides you with some additional fault tolerance. If one cluster process goes down, you can still have the other live processes serving requests while the failed one restarts.
The cluster module for node.js has a couple different scheduling algorithms - the round robin you mention is one. You can read more about that here: Cluster Round-Robin Load Balancing.
Because each cluster worker is a separate process, there is no automatic shared data among the different cluster processes. As such, clustering is simplest to implement either where there is no shared data or where the shared data is already in a place that can be accessed by multiple processes (such as a database).
Keep in mind that a single node.js process (if written to properly use async I/O and not heavily compute bound) can itself serve many requests at once. Clustering is for when you want to expand scalability beyond what one instance can deliver.
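A minimal sketch of that typical setup might look like the following: the primary forks one worker per core and replaces any worker that dies unexpectedly, while each worker runs an ordinary HTTP server. The function names, the clusterApi parameter, and port 3000 are assumptions chosen for the example; cluster.isPrimary requires Node 16 or later (older versions call it isMaster).

```javascript
const cluster = require('node:cluster');
const http = require('node:http');
const os = require('node:os');

// Primary side: fork `count` workers and replace any that die unexpectedly.
// clusterApi defaults to the real module; it is a parameter only so the
// logic can be tried against a stand-in object.
function superviseWorkers(count, clusterApi = cluster) {
  for (let i = 0; i < count; i += 1) {
    clusterApi.fork();
  }

  // 'exit' fires on the cluster object whenever any worker process ends.
  clusterApi.on('exit', (worker) => {
    // exitedAfterDisconnect is true when the exit followed a deliberate
    // disconnect(), so only unexpected exits are replaced.
    if (!worker.exitedAfterDisconnect) {
      clusterApi.fork();
    }
  });
}

// Worker side: every worker listens on the same port and the cluster module
// distributes incoming connections among them (round-robin by default on
// platforms other than Windows).
function startWorker(port) {
  return http
    .createServer((req, res) => res.end(`handled by pid ${process.pid}\n`))
    .listen(port);
}

// Entry point; port 3000 is an arbitrary choice for this sketch.
function main() {
  if (cluster.isPrimary) {
    superviseWorkers(os.cpus().length);
  } else {
    startWorker(3000);
  }
}
```

Calling main() at the bottom of the entry file starts the whole thing. Any state the workers need to share would still have to live outside the processes, for example in a database.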
I have created a proof of concept on clustering in Node.js and written up some details in the blog posts below. Going through them may help clarify things.

I'm not sure how to correctly configure my server setup

This is a multi-part question; my end goal is to establish the best way to set up my server, which will host a website as well as a service used by an iOS (and eventually an Android) app. Both the app service and the website are going to be written in Node.js, as I need high concurrency and scaling for the app server, and I figured that while I'm at it I may as well do the website in Node too, because from my understanding its performance wouldn't be much different from something like Apache.
Also, the website has a lower priority than the app service: the app service should receive significantly higher traffic than the website (though in the long run this may change). Money isn't my greatest priority here, but it is a limiting factor. I feel that having a service with 99.9% uptime (as 100% uptime appears to be virtually impossible in the long run) is more important than saving money at the cost of more downtime.
Firstly, I understand that having one Node process per CPU core is the best way to fully utilise a multi-core CPU. After researching, I now understand that running more than one per core is inefficient because the CPU has to context switch between the processes. Why, then, whenever I see code posted on how to use the built-in cluster module in Node.js, does the master process create a number of workers equal to the number of cores? That would mean you have 9 processes on an 8-core machine (1 master process and 8 worker processes). Is this because the master process is usually there just to restart worker processes if they crash or end, and therefore does so little that it doesn't matter that it shares a CPU core with another Node process?
If this is the case, I am planning to have the workers provide the app service and have the master process manage the workers but also host a webpage that provides statistical information on the server's state and other relevant information (number of clients connected, worker restart count, error logs, etc.). Is this a bad idea? Would it be better to have this webpage running on a separate worker and leave the master process to just manage the workers?
So overall I want the following elements: a service to handle requests from the app (my main source of traffic); a website (fairly simple, a couple of pages and a registration form); an SQL database to store user information; a webpage (probably hosted locally on the server machine) that only I can access, showing information about the server (users connected, worker restarts, server logs, etc.); and apparently nginx would be a good idea, since I'm handling multiple Node processes accepting connections from the app. After doing research I've also found that it would probably be best to host on a VPS initially. I was thinking that at first, when the traffic to the app service will most likely be fairly low, I could run all of those elements on one VPS. Or would it be best to run them on separate VPSs, except for the website and the server status webpage, which could share one? I guess that way, if there is a hardware failure and something goes down, not everything does, and I could run 2 instances of the app service on 2 different VPSs so that if one goes down the other is still functioning. Would this just be overkill? I doubt I would need multiple app service instances to support the traffic load for a while, but it would help reduce the apparent downtime for users.
Maybe this all depends on what I value more and have the time to do? A more complex server setup that costs more and is maybe a little unnecessary but guarantees a consistent and reliable service, or a cheaper and simpler setup that may succumb to downtime due to coding errors and server hardware issues.
Also, it's worth noting that I've never had any real experience with production-level servers, so in some ways I've jumped in at the deep end a little with this. I feel like I've come a long way in the past half a year and that I'm getting a fairly good grasp of what I need to do. I could just do with some advice from someone with experience who has an idea of what roadblocks I may come across along the way and whether I'm causing myself unnecessary problems with this kind of setup.
Any advice is greatly appreciated, thanks for taking the time to read my question.

Strategies for scaling a Node.js application?

I have an app in NodeJS.
Recently we have been getting a lot more traffic (this is a new experience for me) and so I have been running into the "EMFILE: too many open files" error that is caused when a single process tries to open more files than the operating system's per-process limit allows.
I have increased this limit, so we are good for now. However I'm not sure how long this solution will last...
I am wondering: What are other commonly used options for scaling a Node Application that is getting increasing amounts of traffic? (specifically with a mind to the open files limit problem.)
The PM2 process manager, which allows clustering, catches my eye (am I correct in understanding that every instance of the application requires its own core, i.e. you can't run 4 instances on a single core?). Are there any other techniques that are regularly used?
Thanks (in advance)
PM2 is a simple solution when you want to run more than one instance of Node; another common alternative is the built-in cluster module. Keep in mind that if you run the instances as independent processes on separate ports (rather than as cluster workers sharing one port), you will need another HTTP server such as Nginx to reverse proxy user requests to your Node processes.
You can run any number of Node processes, regardless of the number of cores. But since each Node process runs your JavaScript on a single thread, and each core can execute a single thread at a time, the optimal configuration is usually when the number of Node processes matches the number of cores. If the number of Node processes is greater than the number of cores, then under load you will see reduced performance due to the extra context switches your processor has to perform.
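As a small sketch of that rule of thumb, the helper below caps a requested instance count at the number of cores reported by the os module. The name recommendedInstances is made up for this example.

```javascript
const os = require('node:os');

// Cap the requested number of Node processes at the number of CPU cores,
// but always run at least one. `cores` defaults to what the OS reports.
function recommendedInstances(requested, cores = os.cpus().length) {
  return Math.max(1, Math.min(requested, cores));
}

// Asking for 16 processes on an 8-core machine is capped at 8.
console.log(recommendedInstances(16, 8));
```

PM2 can make the same choice for you: starting an app with `-i max` (or `instances: 'max'` in an ecosystem file) spreads it across all available cores.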
