nodejs PM2 cluster mode for load balancing - node.js

When using pm2 cluster there's a pretty severe warning saying you shouldn't use it in production, nor for load balancing, use nginx instead. Unfortunately that's exactly how I planned to use PM2. Is it really not intended to be used for that purpose or is it just not completely ready yet?

The nodejs cluster (0.10) has a lot of issues and is not safe to use in production!
You may want to give 0.11 a try; there were some improvements.
This warning has nothing to do with pm2 itself; it comes directly from the Node cluster module.

Related

Should I use Node.js Greenlock-Express with cluster mode at the same time I'm using pm2 cluster mode?

I'm building a stateless Web application using Node.js, Express and pm2 for process management.
In production environments, I run one instance of the application for each core of the server CPU (thanks to pm2 cluster mode).
Recently I started reading about Greenlock-Express (for obtaining certificates automatically). It also has a "cluster" property which, if I understand correctly, basically does the same thing as pm2 cluster mode.
Will there be any collisions or other issues if I run both Greenlock-Express and pm2 in cluster mode? If there are, what's the best alternative for obtaining SSL certificates automatically with Node.js in a Windows environment? And if there are not, is it optimal to use both of them in cluster mode?
PM2 only implements partial cluster support.
See https://git.rootprojects.org/root/greenlock-express.js/issues/26
I'd recommend just using serviceman (cross-platform) or raw systemd (linux) or Docker (cloud deploys).
If you are going to use PM2, you use Greenlock Express with it the same way that you would use it with Ruby, Python, etc - as a separate executable, not as a "built-in" app.
PM2's default optimizations for node apps are already implemented in Greenlock Express, and since PM2 only has partial cluster support, there's no way to tell PM2 to pass control to Greenlock Express, nor to have PM2 accept control from Greenlock Express.
Also: Only use cluster mode if you actually have multiple CPU cores otherwise you'll cause thread thrashing and slow down your process.
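As a rough illustration of that last point, here is a minimal sketch that checks the core count before deciding whether clustering is worthwhile. The function name clusterPlan is made up for this example; os.cpus() is the standard Node.js call.

```javascript
const os = require('os');

// Hypothetical helper: decide whether clustering makes sense for a
// given number of CPU cores. With one core (or when the count is
// unavailable and reported as 0), extra workers would only compete
// for the same CPU, so a single process is used instead.
function clusterPlan(cpuCount) {
  if (cpuCount <= 1) {
    return { useCluster: false, workers: 1 };
  }
  return { useCluster: true, workers: cpuCount };
}

const cores = os.cpus().length;
console.log(`cores detected: ${cores}`, clusterPlan(cores));
```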

Deploy node.js in production

What are the best practices for deploying a nodejs application in production?
I would like to know how production deployments of Node.js APIs are done today. My application currently runs in Docker, locally.
I wonder whether I should put Nginx inside the container and serve my app through it, or just deploy the Node image that is already running today.
I also need load balancing.
There are a few main types of deployment that are popular today:
Using a platform as a service like Heroku
Using a VPS like AWS, Digital Ocean etc.
Using a dedicated server
This list is in order of growing difficulty and control. So it's easiest with PaaS, but you get more control with a dedicated server, though it gets significantly more difficult, especially when you need to scale out and build clusters.
See this answer for more details on how to install Node on a VPS or a dedicated server:
how to run node js on dedicated server?
I can only add from experience on AWS using a NAT Gateway which is a dedicated Node server with a MongoDB server behind the gateway. (Obviously this is a scalable system and project.)
With or without Docker, you need to control the production environment. This means clearly defining which NPM libraries you will need for production, how you handle environment variables and clusters for cores.
I would suggest, very strongly, using a tool like PM2 to handle clusters, server shutdowns and restarts, and logs (and also workers and slaves, if you need them and write code for them).
This list could go on and on, but keep in mind this is only from an AWS perspective. Setting up a Gateway correctly on AWS is also not an easy process. Be prepared for some gotchas along the way.

Cluster and Fork mode difference in PM2

I've searched a lot to figure this out, but I couldn't find a clear explanation. Is the only difference that a clustered app can be scaled out and a forked app cannot?
PM2's public site explains that Cluster mode provides these features, but no one mentions the advantages of Fork mode (maybe that it can get the NODE_APP_INSTANCE variable).
I feel like Cluster might be a special case of Fork, because Fork seems to be the general-purpose mode. So I guess that, from PM2's point of view, Fork means just a 'forked process' and Cluster means a 'forked process that can be scaled out'. Then why should I use Fork mode?
The main difference between fork_mode and cluster_mode is that it tells pm2 to use either the child_process.fork API or the cluster API.
What does this mean internally?
Fork mode
Think of fork mode as basic process spawning. It lets you change the exec_interpreter, so that you can run a PHP or Python server with pm2. The exec_interpreter is the "command" used to start the child process. By default, pm2 will use node, so pm2 start server.js will do something like:
require('child_process').spawn('node', ['server.js'])
This mode is very useful because it enables a lot of possibilities. For example, you could launch multiple servers on pre-established ports which will then be load-balanced by HAProxy or Nginx.
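As a sketch of the "pre-established ports" idea, each forked instance could derive its own port so that HAProxy or Nginx can balance across them. This assumes pm2 exposes NODE_APP_INSTANCE to every instance (as mentioned in the question); BASE_PORT and portFor are names invented for this example.

```javascript
// Hypothetical helper: give every instance its own port.
// NODE_APP_INSTANCE is assumed to be set by pm2 (0, 1, 2, ...);
// BASE_PORT is an illustrative variable, defaulting to 3000 here.
function portFor(env) {
  const base = Number(env.BASE_PORT || 3000);
  const instance = Number(env.NODE_APP_INSTANCE || 0);
  return base + instance;
}

const port = portFor(process.env);
console.log(`this instance would listen on port ${port}`);
// In the real server: http.createServer(handler).listen(port);
```

With three instances and the default base port, the proxy would then be pointed at ports 3000, 3001 and 3002.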
Cluster mode
Cluster mode only works with node as its exec_interpreter, because it relies on the Node.js cluster module (e.g. isMaster, the fork method, etc.). This is great for zero-configuration process management because the process is automatically forked into multiple instances.
For example pm2 start -i 4 server.js will launch 4 instances of server.js and let the cluster module handle load balancing.
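For comparison, this is roughly what using the cluster module by hand looks like. It is a minimal sketch, not pm2's actual implementation: the worker count (4), the port (3000) and the --serve flag are assumptions made for this example, and the flag only exists so that loading the file does not start servers by accident.

```javascript
const cluster = require('cluster');
const http = require('http');

// isPrimary exists on newer Node versions; isMaster is the older name.
const isPrimary =
  cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;

function startPrimary(workerCount) {
  for (let i = 0; i < workerCount; i++) {
    cluster.fork();
  }
  // Start a replacement whenever a worker exits.
  cluster.on('exit', (worker) => {
    console.log(`worker ${worker.process.pid} exited, starting a replacement`);
    cluster.fork();
  });
}

function startWorker(port) {
  // Every worker calls listen() on the same port; the cluster module
  // arranges for them to share it instead of failing with EADDRINUSE.
  http
    .createServer((req, res) => {
      res.end(`handled by worker ${process.pid}\n`);
    })
    .listen(port);
}

// Only start when run as: node server.js --serve
// (forked workers inherit the same command-line arguments).
if (process.argv.includes('--serve')) {
  if (isPrimary) {
    startPrimary(4);
  } else {
    startWorker(3000);
  }
}
```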
Node.js is single-threaded.
That means a single Node process can use only 1 core of your quad-core CPU.
Running the app as one process like this is what pm2 calls fork_mode.
We use it for local dev.
pm2 start server.js -i 0 runs one Node process on each core of your CPU
and automatically load-balances the incoming requests between them, which is why the app should be stateless.
All processes share the same port.
We call it cluster_mode.
It is used for the sake of performance in production.
You may also choose to do this on local dev if you want to stress test your PC :)
Documentation and sources are really misleading here.
Reading the sources, the only difference seems to be that they use either the Node cluster API or the child_process API. Since cluster is itself built on child_process, you are essentially doing the same thing. There is just a lot more custom stdio passing happening in fork_mode. Also cluster can only be communicated with via strings, not objects.
By default you are using fork_mode. If you pass the -i [number] option, you're going into cluster_mode, which is generally what you aim for with pm2.
Also, fork_mode instances probably can't listen on the same port because of EADDRINUSE; cluster_mode instances can. This way you can also structure your app to run on a single port and be load balanced automatically. You then have to build apps without local state, though, e.g. no in-process sessions or databases.

How to Keep a persistent http-server open on AWS instance

I set up an AWS Ubuntu instance running an http-server using node.js.
I was wondering if it's possible to log out of my remote server while keeping the http-server running.
This is a pretty good tutorial that deals with keeping a node.js server running, and amongst other things, deals with running it in the background.
http://blog.nodejitsu.com/keep-a-nodejs-server-up-with-forever/
Forever is a nice option (as suggested above).
Though, I recommend using AWS' Elastic Beanstalk over EC2 (that's the service you are using now, if I got it right). It gives you an easy interface to deploy your web server without needing ssh, keeps it alive at all times after deployment, and also gives you load balancing and auto scaling features with minimal effort.
You could also use pm2 for this. Besides keeping your http-server online it also gives you the possibility to do load balancing and other tasks.
Run
npm install pm2 -g
on your server and start your app with
pm2 start app.js
As marekful points out, logging out of your Ubuntu server will not have any effect on your http-server.

What is best node.js cluster?

There is a cluster module in node http://nodejs.org/docs/v0.6.19/api/cluster.html
But I found some other implementations like this one https://github.com/learnboost/cluster
Which is the best? Does anyone have experience with them?
Another question:
Is it necessary to use nginx in production? If so, why? How many simultaneous connections can be handled by a single modern multicore server with node: 100K, 200K?
Thanx!
The cluster module from https://github.com/learnboost/cluster is only available for Node v0.2.x and v0.4.x, while the official cluster module is baked into the Node core since v0.6.x. Note that the API will change for v0.8.x (which is around the corner).
So you should use the latest version of Node, with Cluster built in.
NGiNX is faster for serving static files, but other than that I don't see any solid reason to use it. If you want a reverse proxy something like HAProxy is better (or you can use a Node solution like node-http-proxy or bouncy).
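To show what a "Node solution" for reverse proxying boils down to, here is a minimal round-robin sketch built only on the core http module. It does not use the node-http-proxy or bouncy APIs, and the backend addresses and ports are made up for the example.

```javascript
const http = require('http');

// Returns a function that cycles through the targets in order.
function roundRobin(targets) {
  let index = 0;
  return () => {
    const target = targets[index];
    index = (index + 1) % targets.length;
    return target;
  };
}

// Builds (but does not start) a proxy server that forwards each
// incoming request to the next backend in the list.
function createBalancer(targets) {
  const pick = roundRobin(targets);
  return http.createServer((req, res) => {
    const { host, port } = pick();
    const upstream = http.request(
      { host, port, path: req.url, method: req.method, headers: req.headers },
      (upstreamRes) => {
        res.writeHead(upstreamRes.statusCode, upstreamRes.headers);
        upstreamRes.pipe(res);
      }
    );
    // If the backend is unreachable, answer with 502 instead of hanging.
    upstream.on('error', () => {
      if (!res.headersSent) res.writeHead(502);
      res.end('Bad gateway');
    });
    req.pipe(upstream);
  });
}

// Hypothetical backends: two app instances on ports 3001 and 3002.
const backends = [
  { host: '127.0.0.1', port: 3001 },
  { host: '127.0.0.1', port: 3002 },
];
const balancer = createBalancer(backends);
// balancer.listen(8080); // uncomment to start proxying on port 8080
```

A dedicated proxy such as HAProxy adds health checks, connection limits and TLS handling that this sketch leaves out.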
Unless you are running a "Hello World" example in production, you cannot accurately predict how many simultaneous connections can be handled. Normally a single Node process can handle thousands of concurrent connections.
Resources:
https://github.com/nodejitsu/node-http-proxy
https://github.com/substack/bouncy

Resources