Setting up websockets in a multi-instance node environment using PM2

My current setup is running multiple node instances using PM2 to manage the instances and act as a load balancer.
I would like to implement some functionality using websockets.
The first issue that came to mind is sharing the sockets among X node instances.
My understanding is that if I boot up a websocket server within one node instance, only that instance will have access to the sockets connected to it.
I do not want to open a separate set of sockets on every instance for every user, as that seems like a waste of resources.
Currently I am playing around with the websocket package on npm but I am in no way tied to this if there is a better alternative.
I would like the sockets to more or less push data one-way from server to client and avoid anything coming from the client to the server.
My solution so far is to spin up another node instance that solely acts as a websocket server.
This would allow a user to make all requests as normal to the usual instances, but open a websocket connection to the separate node instance dedicated to sockets.
The app servers could then fire off messages to the dedicated socket server whenever something is updated, and it would push the data to the appropriate clients.
I am not sure this is the best option, and I would like to know whether there are other recommended ways of managing websockets across multiple node instances while still being able to spin instances up and down as required.
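Roughly what I have in mind, purely illustrative (the /notify route, ports, and payload shape are all made up, and I'm using the ws package as a stand-in):
// dedicated socket server: pushes one-way, receives notifications from app servers
const http = require('http')
const express = require('express')
const WebSocket = require('ws')

const app = express()
app.use(express.json())
const server = http.createServer(app)
const wss = new WebSocket.Server({ server })

// app instances POST here whenever something is updated
app.post('/notify', (req, res) => {
  const message = JSON.stringify(req.body)
  wss.clients.forEach((client) => {
    if (client.readyState === WebSocket.OPEN) client.send(message)
  })
  res.sendStatus(204)
})

server.listen(3001)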

I'd recommend you avoid a complex setup and just get socket.io working across multiple nodes, thus distributing the load. If you want to avoid data coming from the client to the server, just don't listen to incoming events on your server.
Socket.io supports multiple nodes, under the following conditions:
You have sticky sessions enabled. This ensures requests connect back to the process from which they originated.
You use a special adapter called socket.io-redis & a small Redis instance as a central point of storage - it keeps track of namespaces/rooms and connected sockets across your cluster of nodes.
Here's an example:
// setup io as usual
const io = require('socket.io')(3000)
// Attach the Redis adapter (requires a running Redis instance)
const redisAdapter = require('socket.io-redis')
io.adapter(redisAdapter({ host: 'localhost', port: 6379 }))
From then on, it's business as usual:
io.emit('hello', 'to all clients')
You can read more here: Socket.IO - Using Multiple Nodes.
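And since you only want server-to-client push, the client just listens; a minimal sketch (the event name is made up):
// server: emit whenever something changes; register no incoming handlers
io.emit('ticker:update', { price: 42 })

// client:
const socket = require('socket.io-client')('http://localhost:3000')
socket.on('ticker:update', (data) => console.log(data))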

Related

WebSocket server clustering with state sharing in Kubernetes deployment

I am looking for a way to cluster WebSocket servers - written in node - so that load balancing works properly and a client request is served by an appropriate node instance. In the case of WebSocket, the connection is stateful, and I believe a node cluster could help. I want the connection/state information to be shared so that any node instance can serve the request and the client does not need to keep track of a specific node instance. The reason for this thought process is to ensure that node instances can be killed and replaced by new instances without worrying about the overhead of state management.
I have a setup where we use multiple instances with load balancers in AWS ECS, deployed by CI/CD pipelines. The number of frontend and backend servers varies between 2 and 8 each depending on bursts and current deployments. If one server crashes, a new one will take its place.
We use socket.io with the Redis adapter to share the websocket state between all connected instances via the in-memory db Redis. This ensures that even if the clients are connected to different instances, they all receive the events.
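A sketch of the pattern, assuming the same socket.io-redis setup shown in the answer above (room and event names are illustrative):
const io = require('socket.io')(3000)
io.adapter(require('socket.io-redis')({ host: 'localhost', port: 6379 }))

io.on('connection', (socket) => {
  // each client joins a room for whatever it watches
  socket.join('order-123')
})

// emitted on any instance, delivered to room members on every
// instance via the adapter's Redis pub/sub
io.to('order-123').emit('order:updated', { status: 'shipped' })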

Working with WebSockets and NodeJs clusters

I currently have a Node server running that works with MongoDB. It handles some HTTP requests, but it largely uses WebSockets. Basically, the server connects multiple users to rooms with WebSockets.
My server currently has around 12k WebSockets open and it's almost crippling my single-threaded server, and now I'm not sure how to convert it over.
The server holds HashMap variables for the connected users and rooms. When a user does an action, the server often references those HashMap variables. So, I'm not sure how to use clusters for this. I thought about creating a thread for every WebSocket message, but I'm not sure if this is the right approach, and it would not be able to access the HashMaps for the other users.
Does anyone have any ideas on what to do?
Thank you.
You can look at the socket.io-redis adapter for architectural ideas, or you can just decide to use socket.io and the Redis adapter.
It moves the equivalent of your HashMaps to a separate process (the Redis in-memory database) so all clustered processes can access them.
The socket.io-redis adapter also supports higher-level functions so that you can emit to every socket in a room with one call and the adapter finds where everyone in the room is connected, contacts that specific cluster server, and has it send the message to them.
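A rough sketch of that replacement (room and event names are made up; the clients() query is from the socket.io-redis API):
// let rooms replace the local HashMap of rooms -> users
io.on('connection', (socket) => {
  socket.on('join-room', (roomId) => socket.join(roomId))
})

// one call reaches every member, whichever process they are on
io.to('room-42').emit('room:message', { text: 'hello' })

// cluster-wide queries replace reads of the shared HashMaps
io.of('/').adapter.clients(['room-42'], (err, clients) => {
  console.log(clients) // socket ids in room-42 across all processes
})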
I thought maybe creating a thread for every WebSocket message, but I'm not sure if this is the right approach, and it would not be able to access the HashMaps for the other users
Threads in node.js are not lightweight (each has its own V8 instance), so you will not want a thread per WebSocket connection. You could group a certain number of WebSocket connections onto each worker thread, but at that point it is likely easier to use clustering, because node.js will handle the distribution across the cluster workers for you automatically, whereas you would have to do that yourself for your own worker pool.
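For reference, a bare-bones sketch of the cluster approach (sticky sessions are still needed for socket.io's polling transport, as covered elsewhere on this page):
const cluster = require('cluster')
const os = require('os')

if (cluster.isMaster) {
  // one worker per CPU; node distributes incoming connections
  os.cpus().forEach(() => cluster.fork())
  cluster.on('exit', () => cluster.fork()) // replace crashed workers
} else {
  // every worker runs the full server; shared state lives in Redis,
  // not in per-process HashMaps
  const io = require('socket.io')(3000)
  io.adapter(require('socket.io-redis')({ host: 'localhost', port: 6379 }))
}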

Socket.io with PM2 cluster

I have an existing application of node.js and socket.io with forever. But now I would like to use the pm2 cluster module instead of forever. However, I have been facing some difficulties with socket.io and the cluster instances, as in a few places messages are lost. So I read a little online about another module called socket.io-with-pm2-cluster, which acts as a plugin. But while using it, it asks me to configure things so that each instance listens on a different port. Like if the app is running on port 3000, instances 0,1,2,3 will have to use ports 3001,3002,3003,3004. Can anyone suggest if this is the right approach? Or any other workarounds to make this possible?
I will recommend using socket.io-redis for this purpose, which is the approach recommended by socket.io. If you scale to multiple machines in the future, this will keep working as expected, whereas the port-per-instance approach may fail across multiple machines (for example on AWS); in that case you would also need sticky sessions on the load balancer.
socket.io needs to keep the socket open to get events from the server back to the client (and vice versa), and you are running multiple workers; that is why messages are lost in a few places.
Sticky load balancing
If you plan to distribute the load of connections among different processes or machines, you have to make sure that requests associated with a particular session id connect to the process that originated them.
You need to introduce a layer that makes your service stateless; you can use socket.io-redis:
By running socket.io with the socket.io-redis adapter you can run multiple socket.io instances in different processes or servers that can all broadcast and emit events to and from each other.
Passing events between nodes
Now that you have multiple Socket.IO nodes accepting connections, if you want to broadcast events to everyone (or even everyone in a certain room) you'll need some way of passing messages between processes or computers.
The interface in charge of routing messages is what we call the Adapter. You can implement your own on top of the socket.io-adapter (by inheriting from it) or you can use the one we provide on top of Redis: socket.io-redis:
var io = require('socket.io')(3000);
var redis = require('socket.io-redis');
io.adapter(redis({ host: 'localhost', port: 6379 }));
Then the following call:
io.emit('hi', 'all sockets');
will be broadcast to every node through the Pub/Sub mechanism of Redis.
You can read further details here
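To tie this back to PM2: with the Redis adapter in place, every instance can share the same port in cluster mode, so no port-per-instance mapping is needed. A hypothetical ecosystem file:
// ecosystem.config.js (illustrative values)
module.exports = {
  apps: [{
    name: 'socket-app',
    script: 'server.js',
    instances: 4,         // or 'max'
    exec_mode: 'cluster'  // PM2 balances connections across instances
  }]
}
Note that PM2's built-in balancer is not sticky, so the polling transport can still lose handshakes; a common workaround is to force the websocket transport on the client (transports: ['websocket']).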

Multi threading nodeJS and Socket IO

Okay, so multithreading nodeJS isn't much of a problem from what I've been reading. Just deploy several identical apps and use nginx as a reverse proxy and load balancer across all the apps.
But actually, I found that the native cluster module works pretty well too.
However, what if I have socket.io with the nodeJS app? I tried the same strategy with nodeJS + socket.io, but it did not work, because each emitted socket event is distributed more or less evenly across the instances, and instances other than the one holding the connection have no idea where the request came from.
So the best method I can think of right now is to separate the nodeJS server and the socket.io server: scale the nodeJS server horizontally (multiple identical apps) but have just one socket.io server. Although I believe it would be enough for the purposes of our solution, I still need to plan for the future. Has anyone succeeded in horizontally scaling Socket.IO? So, multiple threads?
The guidelines on the socket.io website use Redis with a package called socket.io-redis
https://socket.io/docs/using-multiple-nodes/
It looks like it just acts as a single pool for the connections, and each node instance connects to it.
Putting your socket server in a separate service (a micro-service) is probably fine; the downside is needing to manage communication between the two services.
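If you do split them, one common way to handle that communication is a message channel instead of direct calls; a sketch using Redis pub/sub (the channel name and payload are made up, assuming the callback-style node_redis v3 client):
// in the stateless node app (publisher)
const redis = require('redis')
const pub = redis.createClient()
pub.publish('events', JSON.stringify({ room: 'news', text: 'hi' }))

// in the socket.io service (subscriber)
const sub = redis.createClient()
const io = require('socket.io')(3000)
sub.subscribe('events')
sub.on('message', (channel, raw) => {
  const msg = JSON.parse(raw)
  io.to(msg.room).emit('update', msg)
})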

Streaming via socket io with multiple servers from single source

I have a single stream source S that produces ticker data.
I would like to integrate S into my node app that uses socket io. My app runs in a multiple server environment in production, let's say servers A and B.
Initially, I thought I would simply:
Use the socket io redis adapter: https://github.com/socketio/socket.io-redis on both A and B.
Connect both A and B to S and have A and B handle the chunks of data emanating from S by broadcasting the chunks into the appropriate rooms.
However, after thinking about this, I am realizing that I will probably run into an issue where both A and B broadcast the same data to the client (and the client receives duplicates of the same information). Am I thinking about this correctly? How can I avoid this?
One client is connected to one server, and the same connection remains open on the same server; this is called session stickiness, so a client will not have two connections open. To achieve that, you should use a proxy that acts as a load balancer in front of your pool of servers; you can use nginx, for example.
All you have to do is synchronise rooms across servers so you can broadcast correctly to all users in a room (because some users will be in a room on server A and others on server B).
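On the duplication concern from the question: if both A and B consume S and emit through the Redis adapter, every client would indeed receive each chunk twice. One way around it (assuming socket.io's local modifier) is for each server to broadcast only to its own clients, since both servers see the same chunks from S anyway:
// on both A and B, for every chunk arriving from S (stream is illustrative)
stream.on('data', (chunk) => {
  // `local` restricts the emit to clients connected to this
  // instance, so each client receives the chunk exactly once
  io.local.emit('tick', chunk)
})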
documentation about nginx and websockets:
https://www.nginx.com/blog/nginx-nodejs-websockets-socketio/
Hope it helps
