PM2 process manager for Node across multiple cores - node.js

I have a simple stateless Node app that I want to instantiate across a multi-core (multi-vCPU AWS instance) server, and I understand how PM2's cluster mode works to obviate the need for using the Cluster module in the app code.
I have a dual-core AWS t2.medium EC2 instance. I believe PM2 is configured correctly: at startup it launches two processes for the app, each with its own PM2 ID and PID.
PM2 is starting the app as follows:
pm2 start [app_name] -i max
PM2 lists the two processes with distinct PM2 IDs and distinct PIDs as expected.
However...
ps -U [username] -au
...suggests both processes are running on the same core.
Am I missing something? (Probably!)
Thanks in advance to anyone who can shed some light on this.

Processes aren't bound to cores, but rather assigned by the OS's scheduler. When your clustered program comes under load, the OS will use both cores to schedule your processes and, of course, all the other stuff it needs to run.
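One way to observe this on Linux is to sample which CPU each worker last ran on. The sketch below is an illustration under stated assumptions, not something from the thread: it reads field 39 (`processor`) of `/proc/<pid>/stat` as described in proc(5), and the helper names `parseLastCpu` and `lastCpuOf` are hypothetical.

```javascript
const fs = require('fs');

// Extract the "processor" field (field 39 in proc(5)):
// the CPU the task last ran on, not a CPU it is bound to.
function parseLastCpu(statLine) {
  // The comm field (field 2) is wrapped in parentheses and may contain spaces,
  // so only the text after the last ')' is split on spaces.
  const rest = statLine.slice(statLine.lastIndexOf(')') + 2).split(' ');
  // rest[0] is field 3 (state), so field 39 sits at index 36.
  return Number(rest[36]);
}

// Linux only: read the stat line for a PID (for example one shown by `pm2 list`).
function lastCpuOf(pid) {
  return parseLastCpu(fs.readFileSync(`/proc/${pid}/stat`, 'utf8'));
}
```

Calling `lastCpuOf(pid)` repeatedly for each worker PID while the app is under load would be expected to show values that change over time, because the scheduler is free to move a process between cores.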

Related

Spawning as many Node.js processes as CPU cores and load-balancing across them

I have several machines. On each of them, I'd like PM2 to spawn as many immortal Node.js processes as there are CPU cores, and have Nginx distribute ingress HTTP traffic across them, rather than using PM2's built-in load balancing (its "cluster mode").
For PM2
How do I spawn as many Node.js processes as there are CPU cores with PM2, without its built-in load balancing?
For Nginx
I could manually list every process's port in the Nginx configuration, but that is cumbersome to maintain. How can I have it configured automatically as the number of CPU cores changes?

Aggregate metrics in child workers for pm2 cluster mode

I use PM2 to start my application in cluster mode. As far as I know, PM2 then does not let me run my own code in the master process, but I need to collect metrics (CPU usage, memory, etc.).
Is it possible to aggregate metrics, or get metrics for the whole app (PM2 cluster mode), from within the child workers and, for example, expose them on a /metrics route?
Unfortunately, I cannot find any open-source libraries for that :(
I found pm2 web + pmx. That solved my problem.
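As an alternative to the tools mentioned above, a worker can also ask the PM2 daemon for the process list through the `pm2` package's programmatic API and sum the numbers itself. This is a minimal sketch, not the poster's solution; the function names `aggregateMetrics` and `collect` are hypothetical, and the object shape is what `pm2.list()` is assumed to return (`name` plus `monit.cpu` and `monit.memory`).

```javascript
// Sum CPU and memory over all PM2 processes that belong to one app.
// `list` is expected to look like the result of pm2.list():
// [{ name, pid, monit: { cpu, memory } }, ...]
function aggregateMetrics(list, appName) {
  const workers = list.filter((p) => p.name === appName);
  return {
    instances: workers.length,
    cpuPercent: workers.reduce((sum, p) => sum + p.monit.cpu, 0),
    memoryBytes: workers.reduce((sum, p) => sum + p.monit.memory, 0),
  };
}

// Query the PM2 daemon from inside any worker (requires the `pm2` package).
function collect(appName, callback) {
  const pm2 = require('pm2'); // loaded lazily so aggregateMetrics has no dependencies
  pm2.connect((err) => {
    if (err) return callback(err);
    pm2.list((listErr, list) => {
      pm2.disconnect();
      if (listErr) return callback(listErr);
      callback(null, aggregateMetrics(list, appName));
    });
  });
}
```

A /metrics handler in any worker could call `collect('my-app', ...)` and serialize the result, so no code needs to run in the master process.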

PM2 Cluster Mode vs. Node Cluster Performance

I understand that PM2 Cluster Mode allows us to easily scale across CPUs on a single machine. Does it create multiple instances of the node application it is scaling? Essentially, is it the same thing as running multiple node applications on different ports with a reverse proxy like Nginx?
Then there's Node Cluster, which forks a child process. Is this approach more efficient than PM2 Cluster Mode, since it runs a single Node application and uses worker threads to process incoming requests?
They do basically the same thing. PM2 uses Node's cluster module under the hood, so in both cases the workers are separate processes, not threads. PM2 just makes things easier: you don't have to handle forking programmatically in your code, you simply run the app as is.
Note that Cluster Mode does not support session stickiness, so make sure your app is stateless.
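For comparison, this is roughly the forking logic you would write by hand with Node's cluster module and that PM2's cluster mode spares you. It is a minimal sketch: the port 3000 and the function names `runClustered` and `startWorker` are hypothetical, and the restart-on-exit behaviour only approximates what PM2 does.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Fork `count` workers from the primary process; each worker runs `startWorker`.
function runClustered(startWorker, count = os.cpus().length) {
  // `isPrimary` exists from Node 16; older versions expose `isMaster`.
  const isPrimary = cluster.isPrimary ?? cluster.isMaster;
  if (isPrimary) {
    for (let i = 0; i < count; i++) cluster.fork();
    // Replace a worker whenever one exits.
    cluster.on('exit', () => cluster.fork());
  } else {
    startWorker();
  }
}

// Each worker is a separate OS process with its own event loop;
// the cluster module lets them all listen on the same port.
function startWorker() {
  http.createServer((req, res) => {
    res.end(`handled by pid ${process.pid}\n`);
  }).listen(3000);
}

// To run: runClustered(startWorker);
```

With PM2, only the body of `startWorker` would remain in the app, and `pm2 start app.js -i max` takes over the role of `runClustered`.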

PM2 NodeJs Cluster Mode

I have 4 EC2 instances running on AWS. PM2 is running in cluster mode on all instances. When I get 5K+ concurrent requests, the app's response time increases significantly.
Every request fetches a Redis key. Under that load a fetch takes up to 10 seconds, whereas without so many concurrent requests it takes only 50 ms. What could be the issue here?
We need to pinpoint the bottleneck. Let's do some diagnostics:
Are the EC2 instances multicore to take advantage of PM2's clustering?
When you execute pm2 start app.js -i X are you sure X=number_of_vCPUs of EC2 instance?
When you execute pm2 monit, do you see all instances of the cluster with roughly equal CPU and memory usage?
When you run htop, what are your total CPU and memory usage percentages?
When you execute iftop, how does your total RX and TX traffic compare to the maximum available on your machine?

Node.JS built in cluster or PM2 clustering?

Which one is better?
I have enabled Node.js clustering with workers, but I have now discovered that PM2 does the same thing.
I'm using Keymetrics to see the stats from my web server, and I have noticed that when I launch my Node.js app with the built-in cluster, without using PM2's cluster feature, Keymetrics reports 20-30 MB of RAM used.
If I deactivate clustering inside Node and switch on PM2 cluster mode, Keymetrics reports about 300 MB of RAM usage.
Which method is better, and why does Keymetrics report only 30 MB of RAM usage with the built-in cluster?
It depends on how your Node application works. If your application is stateless, PM2 cluster mode is easy to use because it requires little or no change to your code. But if your application uses local data, sessions, or sockets, it is recommended to use Node.js's built-in cluster module and start your application normally with PM2.
My Node application uses sockets and MQTT, so I can't use PM2 cluster mode directly (pm2 start app.js -i max): the same Node application would run on every CPU, and it was creating multiple socket connections with the client. So I have to manage the cluster and its workers manually with Node's cluster module, and use packages such as sticky-sessions and socket.io-redis to set up a proper communication flow between all workers. I then start the app simply with pm2 start app.js.
Below are some links which can be helpful.
PM2 Cluster mode
PM2 Recommendation Note
Node Cluster
I use PM2. There are a number of reasons it is better.
Unlike using core's clustering, your code needs little to no modification to use PM2. Clustering logic doesn't belong in every app we ever build.
It scales from the command line. I can simply run pm2 scale my-app +1 to add another worker in realtime after deployment.
You should already be using PM2 anyway to keep the process alive. So clustering comes for free.
I cannot reproduce anything close to your 300MB number. In fact, I recently had a leaky app that I had to use --max-memory-restart on and even in that situation memory usage usually stayed below 100MB. Though it wouldn't surprise me in the slightest if PM2's clustering used more memory, simply because it does a lot for you out-of-the-box.
My suggestion would be to not prematurely optimize. Use PM2 until you genuinely need to squeeze every drop of memory / performance out of your systems (definitely not before you have lots of traffic). At that point you can figure out what the bare minimum is you need from clustering and can re-implement just those parts yourself.
Resources
Clustering walkthrough: https://keymetrics.io/2015/03/26/pm2-clustering-made-easy/
PM2 tutorial: https://futurestud.io/tutorials/pm2-cluster-mode-and-zero-downtime-restarts
