I understand that PM2 Cluster Mode allows us to easily scale across CPUs on a single machine. Does it create multiple instances of the node application it is scaling? Essentially, is it the same thing as running multiple node applications on different ports with a reverse proxy like Nginx?
Then, there's Node Cluster which forks a child process. Is this approach more efficient compared to PM2 Cluster Mode as it is running a single Node Application and using worker threads to process incoming requests?
They basically do the same thing. PM2 uses the Node cluster module under the hood; it makes things easier because you don't have to handle forking programmatically in your code. You just run the app as is.
Note that cluster mode does not support session stickiness, so make sure your app is stateless.
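For comparison, here is a minimal sketch of the forking you would otherwise write by hand with the core cluster module. Each fork is a full Node.js process, not a thread. The names `resolveWorkerCount` and `start`, the port 3000, and the `CLUSTER_DEMO` environment guard are assumptions made for this illustration; they are not part of PM2 or of the cluster API.

```javascript
const cluster = require('node:cluster');
const http = require('node:http');
const os = require('node:os');

// Resolve a worker count; 'max' mirrors the meaning of `pm2 start app.js -i max`.
// (Hypothetical helper, not a PM2 or Node.js API.)
function resolveWorkerCount(requested, cpuCount = os.cpus().length) {
  if (requested === 'max') return cpuCount;
  const n = Number.parseInt(requested, 10);
  return Number.isInteger(n) && n > 0 ? n : 1;
}

function start(requested = 'max', port = 3000) {
  if (cluster.isPrimary) { // `isPrimary` needs Node.js 16+; older versions use `isMaster`
    const count = resolveWorkerCount(requested);
    // Each fork() starts a separate Node.js process running this same file.
    for (let i = 0; i < count; i++) cluster.fork();
  } else {
    // Workers share the listening port; the cluster module distributes connections.
    http.createServer((req, res) => {
      res.end(`handled by worker ${process.pid}\n`);
    }).listen(port);
  }
}

// Opt-in guard (an assumption of this sketch) so loading the file has no side effects.
if (process.env.CLUSTER_DEMO === '1') start();
```

Running it with `CLUSTER_DEMO=1 node app.js` would fork one worker per CPU. With PM2 cluster mode, none of this code is needed in the app.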
PM2 uses the node cluster module to run the application in cluster mode.
The cluster module supports two methods of distributing incoming connections.
The first one (and the default one on all platforms except Windows) is the round-robin approach, where the primary process listens on a port, accepts new connections, and distributes them across the workers in a round-robin fashion, with some built-in smarts to avoid overloading a worker process.
The second approach is where the primary process creates the listen socket and sends it to interested workers. The workers then accept incoming connections directly.
So my question is: which approach is used by PM2 to run the application in cluster mode?
If you know the answer let me know.
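The two distribution methods described above correspond to the `cluster.SCHED_RR` and `cluster.SCHED_NONE` constants of the core cluster module. The sketch below only shows how to inspect the active policy in a plain Node.js process; it does not establish what PM2 itself configures. The helper name `describePolicy` is made up for this example.

```javascript
const cluster = require('node:cluster');

// Map the cluster module's scheduling constants to a readable description.
// (Hypothetical helper for illustration.)
function describePolicy(policy) {
  if (policy === cluster.SCHED_RR) {
    return 'round-robin: the primary accepts connections and hands them to workers';
  }
  if (policy === cluster.SCHED_NONE) {
    return 'none: workers accept connections directly on the shared socket';
  }
  return 'unknown';
}

// The policy can be changed before the first fork, either by assigning
// cluster.schedulingPolicy or through the NODE_CLUSTER_SCHED_POLICY
// environment variable ('rr' or 'none').
console.log(describePolicy(cluster.schedulingPolicy));
```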
The PM2 process manager allows launching Node.js processes in fork and cluster modes.
I understand that cluster mode allows launching n processes, where n is the number of cores in the machine. HTTP, TCP or UDP load is then automatically balanced between these processes.
I am wondering if this load balancing is also occurring for AMQP messaging traffic.
I have a bunch (around 10) of JavaScript scripts that consume messages via RabbitMQ (which implements the AMQP protocol). These scripts are launched by PM2 in cluster mode on a 4-core machine, which brings us to 4 instances per script.
Is the cluster mode making any difference in the previous scenario?
Is there some form of load balancing taking place when using RabbitMQ?
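As far as I know, the cluster module only shares listening server sockets among workers, so it would not balance outbound AMQP connections. Each of the 4 instances would open its own connection, and any distribution would come from RabbitMQ, which by default hands messages to the consumers of one queue in turn. The toy model below illustrates that dispatch under stated assumptions: all instances consume from the same queue and no prefetch limits or acknowledgement delays are involved. It is not RabbitMQ client code, and `dispatchRoundRobin` is a made-up name.

```javascript
// Toy model of a broker handing messages to competing consumers in turn.
// This simulates the idea only; it does not talk to RabbitMQ.
function dispatchRoundRobin(messages, consumerCount) {
  const received = Array.from({ length: consumerCount }, () => []);
  messages.forEach((msg, i) => {
    received[i % consumerCount].push(msg);
  });
  return received;
}

// Six messages spread over the 4 instances of one script.
const perInstance = dispatchRoundRobin(['m1', 'm2', 'm3', 'm4', 'm5', 'm6'], 4);
console.log(perInstance);
// → [ [ 'm1', 'm5' ], [ 'm2', 'm6' ], [ 'm3' ], [ 'm4' ] ]
```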
Which one is better?
I have activated Nodejs clustering mode with workers but now I discovered PM2 that does the same thing.
I'm using keymetrics to see the stats from my webserver and I have noticed that when I launch my NodeJS node (with a built in cluster) without using PM2 cluster feature, Keymetrics reports 20/30MB of Ram used.
If I deactivate clustering (inside node) and I switch on PM2 cluster, keymetrics reports about 300MB of Ram usage.
Now, which method is better, and why does Keymetrics report only 30MB of RAM usage with the built-in cluster?
It actually depends on how your Node application works. If your application is stateless, it is easy to use pm2 cluster mode, as it requires little or no effort in code changes. But if your application uses local data, sessions, or sockets, then it is recommended to use the Node.js built-in cluster module and start your application normally using pm2.
My Node application uses sockets and MQTT, so I can't directly use pm2 cluster mode (pm2 start app.js -i max): the same Node application would run on every CPU, and it was creating multiple socket connections with the client. So I have to manage the cluster and workers manually using Node cluster, and use packages like sticky-sessions and socket.io-redis to set up a proper communication flow between all workers. I then start my Node app simply with pm2 start app.js.
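The idea behind session stickiness can be sketched as follows: the primary process derives a worker index from the client address, so the same client always reaches the same worker. This is a simplified illustration, not the actual implementation of the sticky-sessions package; `hashAddress` and `pickWorkerIndex` are hypothetical names, and the hash is a simple string hash chosen for brevity.

```javascript
// Simple, non-cryptographic string hash kept within 32 bits.
function hashAddress(ip) {
  let h = 0;
  for (const ch of ip) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  }
  return h;
}

// The same address always maps to the same worker index,
// as long as the number of workers does not change.
function pickWorkerIndex(ip, workerCount) {
  return hashAddress(ip) % workerCount;
}

const first = pickWorkerIndex('10.0.0.7', 4);
const second = pickWorkerIndex('10.0.0.7', 4);
console.log(first === second); // the client sticks to one worker
```

In a real setup the primary would accept the connection and pass the socket to the chosen worker; cross-worker messaging would still need something like socket.io-redis, as mentioned above.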
Below are some links which can be helpful.
PM2 Cluster mode
PM2 Recommendation Note
Node Cluster
I use PM2. There are a number of reasons it is better.
Unlike using core's clustering, your code needs little to no modification to use PM2. Clustering logic doesn't belong in every app we ever build.
It scales from the command line. I can simply run pm2 scale my-app +1 to add another worker in realtime after deployment.
You should already be using PM2 anyway to keep the process alive. So clustering comes for free.
I cannot reproduce anything close to your 300MB number. In fact, I recently had a leaky app that I had to use --max-memory-restart on and even in that situation memory usage usually stayed below 100MB. Though it wouldn't surprise me in the slightest if PM2's clustering used more memory, simply because it does a lot for you out-of-the-box.
My suggestion would be to not prematurely optimize. Use PM2 until you genuinely need to squeeze every drop of memory / performance out of your systems (definitely not before you have lots of traffic). At that point you can figure out what the bare minimum is you need from clustering and can re-implement just those parts yourself.
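The command-line options mentioned above can also be kept in a PM2 ecosystem file. This is a minimal sketch: the app name `my-app` is taken from the scale example, while the script path and the 100M limit are placeholder values, not recommendations.

```javascript
// ecosystem.config.js (script path and memory limit are placeholder values)
const config = {
  apps: [
    {
      name: 'my-app',
      script: './app.js',
      exec_mode: 'cluster',       // use PM2's cluster mode instead of fork mode
      instances: 'max',           // one instance per available CPU
      max_memory_restart: '100M', // restart an instance that grows past this size
    },
  ],
};

module.exports = config;
```

It would be started with `pm2 start ecosystem.config.js`, after which `pm2 scale my-app +1` works as described.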
Resources
Clustering walkthrough: https://keymetrics.io/2015/03/26/pm2-clustering-made-easy/
PM2 tutorial: https://futurestud.io/tutorials/pm2-cluster-mode-and-zero-downtime-restarts
I'm working on a project with Node.js that involves a server. Due to the large number of jobs, I need clustering to divide the jobs between different servers (different physical machines). Note that my jobs have nothing to do with the internet, so I cannot use stateless connections (or Redis to keep state) with a load balancer in front of the servers to distribute the connections.
I already read about the "cluster" module, but from what I understood, it seems to scale only across multiple processors on the same machine.
My question: is there any suitable distributed module available in Node.js for my work? What about Apache mesos? I have heard that mesos can abstract multiple physical machines into a single server? is it correct? If yes, it is possible to use the node.js cluster module on top of the mesos, since now we have only one virtual server?
Thanks
My question: is there any suitable distributed module available in Node.js for my work?
Don't know.
I have heard that mesos can abstract multiple physical machines into a single server? is it correct?
Yes. Almost. It allows you to pool resources (CPU, RAM, disk) across multiple machines, gives you the ability to allocate resources for your applications, and runs and manages those applications. So you can ask Mesos to run X instances of node.js and specify how many resources each instance needs.
http://mesos.apache.org
https://www.cs.berkeley.edu/~alig/papers/mesos.pdf
If yes, it is possible to use the node.js cluster module on top of the mesos, since now we have only one virtual server?
Admittedly, I don't know anything about node.js or clustering in node.js. Going by http://nodejs.org/api/cluster.html, it just forks off a bunch of child workers and then round robins the connection between them. You have 2 options off the top of my head:
Run node.js on Mesos using an existing framework such as Marathon. This will be fastest way to get something going on Mesos. https://github.com/mesosphere/marathon
Create a Mesos framework for node.js, which essentially does what cluster node.js is doing, but across the machines. http://mesos.apache.org/documentation/latest/app-framework-development-guide/
In both these solutions, you have the option of letting Mesos create as many instances of node.js as you need, or, use Mesos to run cluster node.js on each machine and let it manage all the workers on that machine.
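As a rough sketch of the first option, a Marathon app definition states how many instances to run and how much CPU and memory each one gets. The field names below (`id`, `cmd`, `cpus`, `mem`, `instances`) are, to my knowledge, the basic Marathon app fields; the app id, command, and values are made-up placeholders. The snippet only builds and prints the JSON document and does not contact a Marathon server.

```javascript
// Hypothetical Marathon app definition for a Node.js job runner.
const appDefinition = {
  id: '/node-worker',  // placeholder app id
  cmd: 'node app.js',  // command Mesos runs for each instance
  cpus: 0.5,           // CPU share requested per instance
  mem: 256,            // memory in MB requested per instance
  instances: 4,        // number of copies spread across the pooled machines
};

// The JSON form is what would be submitted to Marathon.
console.log(JSON.stringify(appDefinition, null, 2));
```

If each instance should also use every core on its machine, the `cmd` could start an app that uses the cluster module locally, which matches the second variant described above.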
I didn't google, but there might already be a node.js mesos framework out there!
Is it possible to build nodejs server with master/slave mode or cluster mode with only one CPU core, so that the others could be up once the current thread is down?
Yes, when using the core cluster module, you can spawn more children than cores. It is not recommended for regular use due to context switching and the overhead incurred by additional node processes.
However, if this is to load new code, an overall different approach is required. There are some existing modules that can assist with zero downtime reloads and they mostly proxy requests to new instances to perform the switch.
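Spawning more workers than cores and replacing one when it dies can be sketched with the cluster module's `exit` event. The function name `keepAlive` and its second parameter are assumptions of this example; the parameter exists only so the logic can be exercised with a stand-in object instead of the real cluster module.

```javascript
const cluster = require('node:cluster');

// Fork `count` workers and fork a replacement whenever one dies unexpectedly.
// `clusterLike` defaults to the real cluster module (hypothetical test seam).
function keepAlive(count, clusterLike = cluster) {
  for (let i = 0; i < count; i++) clusterLike.fork();

  clusterLike.on('exit', (worker, code, signal) => {
    // exitedAfterDisconnect is true for intentional shutdowns; skip those.
    if (worker.exitedAfterDisconnect) return;
    console.log(`worker ${worker.process.pid} died (${signal || code}), forking a replacement`);
    clusterLike.fork();
  });
}
```

In an application this would be called only in the primary process, for example `if (cluster.isPrimary) keepAlive(2);` on a single-core machine, with the server code running in the workers.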