Cluster or nginx? Cluster Stability: 1 - Experimental - node.js

Cluster Stability: 1 - Experimental
I'm currently working with Node.js. Is anyone using Cluster in production? Or should I put nginx in front and run two Node processes instead? Please advise.

I tried Cluster with a single node process, but didn't see much of a performance gain. :(
After a lot of trial and error measuring throughput and response time, I settled on N two-core machines in EC2. Each machine runs two node processes on different ports, with nginx on the same machine routing requests to those two processes. All the machines sit behind an ELB. Happy time :)
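For a setup like the one above, the same app has to be startable on different ports. A minimal sketch, assuming the port comes from a PORT environment variable; the `resolvePort` helper and the 3000 fallback are hypothetical, not taken from the original setup:

```javascript
const http = require('node:http');

// Hypothetical helper: read the listening port from the environment.
function resolvePort(env, fallback = 3000) {
  const port = Number.parseInt(env.PORT, 10);
  return Number.isInteger(port) && port > 0 && port < 65536 ? port : fallback;
}

const server = http.createServer((req, res) => {
  // Report which process answered, to make the nginx routing visible.
  res.end(`pid ${process.pid} on port ${resolvePort(process.env)}\n`);
});

// Listen only when PORT is set, so each process is started with an explicit port.
if (process.env.PORT) {
  server.listen(resolvePort(process.env));
}
```

Starting it once with PORT=3001 and once with PORT=3002 would give nginx two upstream processes to route to on each machine.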

Related

PM2 Cluster Mode vs. Node Cluster Performance

I understand that PM2 Cluster Mode allows us to easily scale across CPUs on a single machine. Does it create multiple instances of the node application it is scaling? Essentially, is it the same thing as running multiple node applications on different ports with a reverse proxy like Nginx?
Then, there's Node Cluster which forks a child process. Is this approach more efficient compared to PM2 Cluster Mode as it is running a single Node Application and using worker threads to process incoming requests?
They basically do the same thing: PM2 uses Node Cluster under the hood. It makes things easier because you don't have to handle forking programmatically in your code; you just run your app as is.
Note that Cluster Mode does not support session stickiness, so make sure your app is stateless.
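For comparison, here is a minimal sketch of the forking code you would write yourself with the built-in cluster module. The `workerCount` and `start` helpers, the default port 3000, and the restart-on-exit policy are illustrative assumptions, not PM2's actual implementation:

```javascript
const cluster = require('node:cluster');
const http = require('node:http');
const os = require('node:os');

// Hypothetical helper: 'max' means one worker per CPU core.
function workerCount(requested, cpus = os.cpus().length) {
  if (requested === 'max') return cpus;
  return Math.max(1, Number(requested) || 1);
}

// Hypothetical entry point; call start() at the top level of your app file.
function start(port = 3000, requested = 'max') {
  // isPrimary replaced isMaster in Node 16; fall back for older versions.
  const isPrimary = cluster.isPrimary ?? cluster.isMaster;
  if (isPrimary) {
    const n = workerCount(requested);
    for (let i = 0; i < n; i++) cluster.fork();
    // Replace a worker that dies so capacity stays constant.
    cluster.on('exit', () => cluster.fork());
  } else {
    // Every worker listens on the same port; the primary distributes connections.
    http
      .createServer((req, res) => res.end(`handled by pid ${process.pid}\n`))
      .listen(port);
  }
}
```

Each forked worker is a separate Node process running the same script, which is why any per-process state (sessions, in-memory caches) is not shared between workers.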

Node.JS built in cluster or PM2 clustering?

Which one is better?
I enabled Node.js clustering with workers, but I have now discovered that PM2 does the same thing.
I'm using Keymetrics to see the stats from my web server. When I launch my Node.js app with the built-in cluster and without PM2's cluster feature, Keymetrics reports 20-30MB of RAM used.
If I deactivate clustering inside Node and switch on PM2 cluster mode, Keymetrics reports about 300MB of RAM usage.
Which method is better, and why does Keymetrics report only 30MB of RAM usage with the built-in cluster?
It depends on how your Node application works. If your application is stateless, PM2 cluster mode is the easy choice, since it requires little or no change to your code. But if your application uses local data, sessions, or sockets, it is recommended to use the Node.js built-in cluster module and start your application normally with PM2.
My Node application uses sockets and MQTT, so I can't use PM2 cluster mode directly (pm2 start app.js -i max): the same application would run on every CPU and create multiple socket connections with the client. So I have to manage the cluster and workers manually with Node cluster, and use packages like sticky-sessions and socket.io-redis to set up a proper communication flow between all workers. I then start my app with a plain pm2 start app.js.
Below are some links which can be helpful.
PM2 Cluster mode
PM2 Recommendation Note
Node Cluster
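The idea behind the sticky sessions mentioned above can be sketched as follows: the same client always has to reach the same worker, so the worker is chosen from the client's address rather than round-robin. The `pickWorker` helper is hypothetical and is not the API of any of the packages named above; it only illustrates the routing rule.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: map a client IP to a stable worker index.
function pickWorker(clientIp, workerTotal) {
  const digest = crypto.createHash('sha1').update(clientIp).digest();
  // First 4 bytes as an unsigned integer, reduced modulo the worker count.
  return digest.readUInt32BE(0) % workerTotal;
}

// The same address always maps to the same worker index.
console.log(pickWorker('10.0.0.1', 4) === pickWorker('10.0.0.1', 4));
```

Because the index depends only on the address and the worker count, a reconnecting socket client lands on the worker that already holds its connection state, as long as the number of workers does not change.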
I use PM2. There are a number of reasons it is better.
Unlike using core's clustering, your code needs little to no modification to use PM2. Clustering logic doesn't belong in every app we ever build.
It scales from the command line. I can simply run pm2 scale my-app +1 to add another worker in realtime after deployment.
You should already be using PM2 anyway to keep the process alive. So clustering comes for free.
I cannot reproduce anything close to your 300MB number. In fact, I recently had a leaky app that I had to use --max-memory-restart on and even in that situation memory usage usually stayed below 100MB. Though it wouldn't surprise me in the slightest if PM2's clustering used more memory, simply because it does a lot for you out-of-the-box.
My suggestion would be to not prematurely optimize. Use PM2 until you genuinely need to squeeze every drop of memory / performance out of your systems (definitely not before you have lots of traffic). At that point you can figure out what the bare minimum is you need from clustering and can re-implement just those parts yourself.
Resources
Clustering walkthrough: https://keymetrics.io/2015/03/26/pm2-clustering-made-easy/
PM2 tutorial: https://futurestud.io/tutorials/pm2-cluster-mode-and-zero-downtime-restarts

clustering in node.js using mesos

I'm working on a project with Node.js that involves a server. Because of the large number of jobs, I need clustering to divide the jobs between different servers (different physical machines). Note that my jobs have nothing to do with the internet, so I cannot use stateless connections (or redis to keep state) with a load balancer in front of the servers to distribute the connections.
I have already read about the "cluster" module, but from what I understood, it only scales across multiple processors on the same machine.
My question: is there any suitable distributed module available in Node.js for my work? What about Apache Mesos? I have heard that Mesos can abstract multiple physical machines into a single server. Is that correct? If yes, is it possible to use the Node.js cluster module on top of Mesos, since we would then have only one virtual server?
Thanks
My question: is there any suitable distributed module available in Node.js for my work?
Don't know.
I have heard that mesos can abstract multiple physical machines into a single server? is it correct?
Yes. Almost. It allows you to pool resources (CPU, RAM, disk) across multiple machines, gives you the ability to allocate resources to your applications, and runs and manages those applications. So you can ask Mesos to run X instances of node.js and specify how many resources each instance needs.
http://mesos.apache.org
https://www.cs.berkeley.edu/~alig/papers/mesos.pdf
If yes, it is possible to use the node.js cluster module on top of the mesos, since now we have only one virtual server?
Admittedly, I don't know anything about node.js or clustering in node.js. Going by http://nodejs.org/api/cluster.html, it just forks off a bunch of child workers and then round robins the connection between them. You have 2 options off the top of my head:
Run node.js on Mesos using an existing framework such as Marathon. This will be fastest way to get something going on Mesos. https://github.com/mesosphere/marathon
Create a Mesos framework for node.js, which essentially does what the node.js cluster module does, but across machines. http://mesos.apache.org/documentation/latest/app-framework-development-guide/
In both of these solutions, you have the option of letting Mesos create as many instances of node.js as you need, or of using Mesos to run the node.js cluster module on each machine and letting it manage all the workers on that machine.
I didn't google, but there might already be a node.js mesos framework out there!
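For the Marathon option, "run X instances and specify the resources per instance" roughly translates into an app definition like the one built below. This is a sketch under assumptions: the `marathonApp` helper is hypothetical, and the id, command, and resource numbers are placeholders.

```javascript
// Hypothetical helper: build a Marathon app definition for a Node.js service.
function marathonApp({ id, cmd, instances, cpus, memMb }) {
  return { id, cmd, instances, cpus, mem: memMb };
}

const app = marathonApp({
  id: '/node-worker',    // placeholder app id
  cmd: 'node server.js', // placeholder start command
  instances: 4,          // X instances, placed across the pooled machines
  cpus: 0.5,             // CPU share per instance
  memMb: 256,            // memory per instance, in MB
});

// Marathon accepts definitions like this as JSON through its REST API.
console.log(JSON.stringify(app, null, 2));
```

Each instance here is an independent Node process, so the jobs still need some way to be divided among them; Marathon only handles placement and restarts.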

npm cluster package on a server cluster

So I have an app I am working on and I am wondering if I am doing it correctly.
I am running cluster on my node.js app, here is a link to cluster. I couldn't find anywhere that states if I should only run cluster on a single server or if it is okay to run it on a cluster of servers. If I continue down the road I am going I will have a cluster inside a cluster.
So that the answers are not just opinions, here is my question: was the cluster package made to do what I am doing (a cluster of workers on a single server, inside a cluster of servers)?
Thanks in advance!
Cluster wasn't specifically designed for that, but there is nothing about it which would cause a problem. If you've designed your app to work with cluster, it's a good indication that your app will also scale across multiple servers. The main gotcha would be if you're doing anything stateful on the filesystem. For example, if a user uploads a photo and you store it on the server disk, that would be problematic when scaling out across multiple servers (that don't share the same disk).

Elasticsearch deployment in a 2 server load balanced node js application setting

I have the following production setup for my Node JS application:
I am now going to integrate Elasticsearch in this setup. My question is regarding the best practices for deploying Elasticsearch in a production environment. All my instances are virtual machines, and I understand that Elasticsearch uses a lot of memory.
Should I therefore set up Elasticsearch on its own server (server 3), set it up on both server 1 and server 2 as a cluster (much like the MongoDB replica set), or install it as a separate instance on each server?
What would be the benefits of the chosen method?
Many thanks!
Option 2.
Briefly: I would definitely set this up on both servers, giving you two nodes. Of the options you have stated, this provides the best distribution, load balancing, performance, and fault tolerance.
Make sure you configure memory allocation manually and carefully: assign about 50% of each node's available memory to the Elasticsearch heap, and leave the rest for the operating system's filesystem cache, which Lucene relies on.
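The 50% rule can be worked out per node with a small script. This is a sketch, not an official sizing tool: the `heapSizeMb` and `jvmHeapFlags` helpers are hypothetical, and the 31 GB ceiling is an added assumption based on the common advice to keep the JVM heap below roughly 32 GB.

```javascript
const os = require('node:os');

// Hypothetical helper: half of RAM for the heap, capped at an assumed 31 GB ceiling.
function heapSizeMb(totalMb, capMb = 31 * 1024) {
  return Math.min(Math.floor(totalMb / 2), capMb);
}

// Format the result as JVM heap flags with equal minimum and maximum size.
function jvmHeapFlags(totalMb) {
  const heap = heapSizeMb(totalMb);
  return [`-Xms${heap}m`, `-Xmx${heap}m`];
}

// Print the flags for the machine this script runs on.
const totalMb = Math.floor(os.totalmem() / (1024 * 1024));
console.log(jvmHeapFlags(totalMb).join(' '));
```

On a VM with 8 GB of memory this gives a 4 GB heap. Since the Node app and MongoDB share the same servers in this setup, the real budget for Elasticsearch is likely smaller than half of the machine's total.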

Resources