How to implement PM2 clustering with WebSockets in Node.js

I am writing a real-time web server with the WebSocket (ws) library in Node.js (Express). When I run the server in PM2 cluster mode, the WebSockets do not work properly. Is there any way to share data between PM2 cluster instances so WebSockets work?

From what I understand, that's kind of the point of cluster mode. It wouldn't be able to keep the connection alive across the multiple node processes it runs. Straight from the PM2 docs:
Be sure your application is stateless meaning that no local data is stored in the process, for example sessions/websocket connections, session-memory and related. Use Redis, Mongo or other databases to share states between processes.
If you could pull your websocket out into its own instance and avoid side effects from a pm2 cluster, that's probably the most performant option, although maybe not feasible in your setup. Best of luck!

You can switch the library to socket.io; it already solves this issue through a Redis adapter for socket.io: Socket.IO Redis.
By running socket.io with the socket.io-redis adapter you can run multiple socket.io instances in different processes or servers that can all broadcast and emit events to and from each other.

Related

Working with WebSockets and NodeJs clusters

I currently have a Node server running that works with MongoDB. It handles some HTTP requests, but it largely uses WebSockets. Basically, the server connects multiple users to rooms via WebSockets.
My server currently has around 12k WebSockets open and it's almost crippling my single threaded server, and now I'm not sure how to convert it over.
The server holds HashMap variables for the connected users and rooms, and when a user performs an action, the server often references those HashMap variables. So I'm not sure how to use clusters here. I thought about maybe creating a thread for every WebSocket message, but I'm not sure if this is the right approach, and it would not be able to access the HashMaps for the other users.
Does anyone have any ideas on what to do?
Thank you.
You can look at the socket.io-redis adapter for architectural ideas or you can just decide to use socket.io and the Redis adapter.
They move the equivalent of your HashMaps into a separate process, a Redis in-memory database, so all clustered processes can access it.
The socket.io-redis adapter also supports higher-level functions so that you can emit to every socket in a room with one call and the adapter finds where everyone in the room is connected, contacts that specific cluster server, and has it send the message to them.
I thought maybe creating a thread for every WebSocket message, but I'm not sure if this is the right approach, and it would not be able to access the HashMaps for the other users
Threads in node.js are not lightweight things (each has its own V8 instance) so you will not want a nodejs thread for every WebSocket connection. You could group a certain number of WebSocket connections on a web worker, but at that point, it is likely easier to use clustering because nodejs will handle the distribution across the clusters for you automatically whereas you'll have to do that yourself for your own web worker pool.
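To make the answer above concrete, here is a minimal, hedged sketch of moving the per-process HashMaps into a shared store. The `RoomStore` wrapper, the `room:<name>:members` key naming, and the `RedisStub` are all invented for illustration; a real setup would issue the same set commands (SADD/SREM/SMEMBERS) through an actual Redis client, so that joins done by one cluster worker are visible to all the others.

```javascript
// Sketch: the in-process HashMaps become Redis sets that every cluster
// worker can read. The key schema and RoomStore class are hypothetical;
// a Map-of-Sets stub stands in for Redis so this runs standalone.

class RedisStub {
  constructor() { this.sets = new Map(); }
  sadd(key, member) {               // models Redis SADD
    if (!this.sets.has(key)) this.sets.set(key, new Set());
    this.sets.get(key).add(member);
  }
  srem(key, member) {               // models Redis SREM
    this.sets.get(key)?.delete(member);
  }
  smembers(key) {                   // models Redis SMEMBERS
    return [...(this.sets.get(key) ?? [])];
  }
}

// Every worker talks to the same Redis, so room membership written by
// one process is immediately readable from any other process.
class RoomStore {
  constructor(redis) { this.redis = redis; }
  join(room, userId)  { this.redis.sadd(`room:${room}:members`, userId); }
  leave(room, userId) { this.redis.srem(`room:${room}:members`, userId); }
  members(room)       { return this.redis.smembers(`room:${room}:members`); }
}

const store = new RoomStore(new RedisStub());
store.join('lobby', 'user-1');
store.join('lobby', 'user-2');
console.log(store.members('lobby')); // [ 'user-1', 'user-2' ]
```

The point of the wrapper is that the calling code no longer cares which worker handled the join; the store is the single source of truth.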

Socket.io with pm2 cluster

I have an existing Node.js and socket.io application running under forever. Now I would like to use the pm2 cluster module instead of forever, but I have been facing some difficulties with socket.io and the cluster instances: in a few places messages are lost. Reading around online, I found another module called socket.io-with-pm2-cluster, which acts as a plugin. While using it, it asks me to configure things so that each instance listens on a different port: if the app is running on port 3000, instances 0, 1, 2, 3 will have to use 3001, 3002, 3003, 3004. Can anyone suggest whether this is the right approach? Or any other workarounds to make this possible?
I recommend using socket.io-redis for this purpose, which is the approach recommended by socket.io. That way, if you scale to multiple machines in the future, it will keep working as expected, whereas your current approach may fail across multiple machines (for example on AWS); in that case you could also use sticky sessions on the load balancer.
socket.io needs to keep the socket open to get events from the server back to the client (and vice versa), and you are running multiple workers; that is why "in a few places the message is lost".
Sticky load balancing
If you plan to distribute the load of connections among different processes or machines, you have to make sure that requests associated with a particular session id connect to the process that originated them.
You need to introduce a layer that makes your service stateless; you can use socket.io-redis:
By running socket.io with the socket.io-redis adapter you can run multiple socket.io instances in different processes or servers that can all broadcast and emit events to and from each other.
Passing events between nodes
If you want to have multiple Socket.IO nodes accepting connections, and you want to broadcast events to everyone (or even everyone in a certain room), you'll need some way of passing messages between processes or computers.
The interface in charge of routing messages is what we call the Adapter. You can implement your own on top of the socket.io-adapter (by inheriting from it) or you can use the one we provide on top of Redis: socket.io-redis:
var io = require('socket.io')(3000);
var redis = require('socket.io-redis');
io.adapter(redis({ host: 'localhost', port: 6379 }));
Then the following call:
io.emit('hi', 'all sockets');
will be broadcast to every node through the Pub/Sub mechanism of Redis.
You can read further details here

Setting up websockets in a multi-instance node environment using PM2

My current setup is running multiple node instances using PM2 to manage the instances and act as a load balancer.
I would like to implement some functionality using websockets.
The first issue that came to mind is sharing the sockets among X node instances.
My understanding is that if I boot up a websocket-server within a node env only that env will have access to the sockets connected to it.
I do not want to load up web sockets for each instance for each user as that seems like a waste of resources.
Currently I am playing around with the websocket package on npm but I am in no way tied to this if there is a better alternative.
I would like the sockets to more or less push data one-way from server to client and avoid anything coming from the client to the server.
My solution so far is spin up another node instance that solely acts as a websocket server.
This would allow a user to make all requests as normal to the usual instances but make a websocket connection to the separate node instance dedicated to sockets.
The server could then fire off messages to the dedicated socket server any time something is updated, to send data back to the appropriate clients.
I am not sure this is the best option and I am trying to see if there are other recommended ways of managing websockets across multiple node instances yet still allow me to spin up/down node instances as required.
I'd recommend you avoid a complex setup and just get socket.io working across multiple nodes, thus distributing the load. If you want to avoid data coming from the client to the server, just don't listen to incoming events on your server.
Socket.io supports multiple nodes, under the following conditions:
You have sticky sessions enabled. This ensures requests connect back to the process from which they originated.
You use a special adapter called socket.io-redis & a small Redis instance as a central point of storage - it keeps track of namespaces/rooms and connected sockets across your cluster of nodes.
Here's an example:
// setup io as usual
const io = require('socket.io')(3000)
// Set a redisAdapter as an adapter.
const redisAdapter = require('socket.io-redis')
io.adapter(redisAdapter({ host: 'localhost', port: 6379 }))
From then on, it's business as usual:
io.emit('hello', 'to all clients')
You can read more here: Socket.IO - Using Multiple Nodes.
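For the sticky-session condition mentioned above, here is a minimal sketch of the usual technique: hash the client's IP so the same client always lands on the same worker, which is roughly what nginx's ip_hash directive and the sticky-session npm module do. The function name and hash are invented for illustration, not taken from any particular library.

```javascript
// Sticky routing sketch: deterministically map a client IP to a worker
// index so reconnects and polling requests hit the same process.
// stickyWorkerIndex and its hash are illustrative, not a library API.

function stickyWorkerIndex(ip, workerCount) {
  let hash = 0;
  for (const ch of ip) {
    // simple deterministic string hash, kept in unsigned 32-bit range
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  }
  return hash % workerCount;
}

// The same IP always maps to the same worker index.
console.log(stickyWorkerIndex('203.0.113.7', 4) === stickyWorkerIndex('203.0.113.7', 4)); // true
```

Note the trade-off: IP hashing breaks down behind proxies that collapse many clients onto one address, which is why cookie-based stickiness is sometimes preferred at the load balancer.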

Multi threading nodeJS and Socket IO

Okay, so multithreading Node.js isn't much of a problem from what I've been reading: just deploy several identical apps and use nginx as a reverse proxy and load balancer across all of them.
But actually native cluster module works pretty well too, I found.
However, what if I have socket.io with the Node.js app? I tried the same strategy with Node.js + Socket.IO, but it obviously did not work, because emitted socket events get distributed more or less evenly across workers, and workers other than the one holding the connection have no idea where the request came from.
So the best method I can think of right now is to separate the Node.js server and the Socket.IO server: scale the Node.js server horizontally (multiple identical apps) but have just one Socket.IO server. Although I believe that would be enough for our solution, I still need to plan for the future. Has anyone succeeded in horizontally scaling Socket.IO across multiple processes?
The guidelines on the socket.io website use Redis with a package called socket.io-redis
https://socket.io/docs/using-multiple-nodes/
It looks like it just acts as a single pool for the connections, and each node instance connects to it.
Putting your socket server in a separate service (micro-service) is probably fine; the downside is needing to manage communication between the two instances.

Socket.io (node.js) on an EC2 array of servers; alternative to redis store for discovery and persistency?

I'm running socket.io/node.js on my EC2 auto-scaling array of servers. As soon as I have more than 1 server, there is the obvious problem where the socket.io connections need to be shared between the servers. This is where you'd normally use the redis store plugin that comes with socket.io.
Unfortunately, I use MongoDB (not redis), and the cost of adding another database to my stack would be prohibitively high. I tried using the socket.io-mongo module, but it absolutely killed my servers (huge CPU usage).
Is there any other way to share the socket.io session between EC2 servers?
