Streaming via socket io with multiple servers from single source - node.js

I have a single stream source S that produces ticker data.
I would like to integrate S into my node app that uses socket io. My app runs in a multiple server environment in production, let's say servers A and B.
Initially, I thought I would simply:
Use the socket.io redis adapter: https://github.com/socketio/socket.io-redis on both A and B.
Connect both A and B to S and have each handle the chunks of data emanating from S by broadcasting them into the appropriate rooms.
However, after thinking about this, I am realizing that I will probably run into an issue where both A and B broadcast the same data to the client (and the client receives duplicates of the same information). Am I thinking about this correctly? How can I avoid this?

One client must be connected to one server, and the same connection must remain open on the same server; this is called session stickiness, so a client will not have two connections open. To achieve that, you should use a proxy acting as a load balancer in front of your pool of servers; you can use nginx, for example.
All you have to do is synchronise rooms across servers so broadcasts reach all users in a room (because some users will be in a room on server A and others on server B).
Documentation about nginx and WebSockets:
https://www.nginx.com/blog/nginx-nodejs-websockets-socketio/
Hope it helps
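One way to avoid the duplicate broadcasts raised in the question is to have only a single process consume S and let the redis adapter fan each emit out to clients connected to either A or B. A rough sketch (streamS, the chunk shape, and the 'ticker' event name are all hypothetical):

const io = require('socket.io')(3000);
const redisAdapter = require('socket.io-redis');
io.adapter(redisAdapter({ host: 'localhost', port: 6379 }));

// Only this one process subscribes to the source S, so each chunk is
// emitted exactly once; the Redis adapter relays it to the sockets held
// by every other server. streamS and chunk.room are hypothetical names.
streamS.on('data', (chunk) => {
  io.to(chunk.room).emit('ticker', chunk.payload);
});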

Related

Working with WebSockets and NodeJs clusters

I currently have a Node server running that works with MongoDB. It handles some HTTP requests, but it is largely used for WebSockets. Basically, the server connects multiple users to rooms with WebSockets.
My server currently has around 12k WebSockets open and it's almost crippling my single-threaded server, and now I'm not sure how to convert it over.
The server holds HashMap variables for the connected users and rooms. When a user does an action, the server often references those HashMap variables, so I'm not sure how to use clusters for this. I thought about creating a thread for every WebSocket message, but I'm not sure if this is the right approach, and it would not be able to access the HashMaps for the other users.
Does anyone have any ideas on what to do?
Thank you.
You can look at the socket.io-redis adapter for architectural ideas, or you can just decide to use socket.io and the Redis adapter.
It moves the equivalent of your hashmaps into a separate process (the Redis in-memory database) so that all clustered processes can access them.
The socket.io-redis adapter also supports higher-level functions so that you can emit to every socket in a room with one call and the adapter finds where everyone in the room is connected, contacts that specific cluster server, and has it send the message to them.
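With the adapter in place, one call reaches everyone in a room no matter which cluster process holds each socket (the room and event names here are made up):

// The adapter locates every member of 'room-42' across the cluster.
io.to('room-42').emit('update', { price: 101.5 });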
I thought about creating a thread for every WebSocket message, but I'm not sure if this is the right approach, and it would not be able to access the HashMaps for the other users.
Threads in node.js are not lightweight (each worker thread has its own V8 instance), so you will not want a node.js thread for every WebSocket connection. You could group a certain number of WebSocket connections per worker thread, but at that point it is likely easier to use clustering, because node.js will distribute incoming connections across the cluster for you automatically, whereas you would have to do that yourself with your own worker pool.
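As a sketch of what moving the hashmaps into Redis can look like (assuming the ioredis client; the room:<id> key scheme is made up for illustration):

const Redis = require('ioredis');
const redis = new Redis(); // defaults to localhost:6379

// Room membership lives in Redis sets instead of an in-process HashMap,
// so every clustered worker sees the same state.
async function joinRoom(roomId, userId) {
  await redis.sadd('room:' + roomId, userId);
}

async function leaveRoom(roomId, userId) {
  await redis.srem('room:' + roomId, userId);
}

function roomMembers(roomId) {
  return redis.smembers('room:' + roomId); // resolves to an array of user ids
}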

Socket.io with PM2 cluster

I have an existing node.js and socket.io application that runs under forever. Now I would like to use PM2's cluster mode instead of forever, but I have been facing some difficulties with socket.io and the cluster instances: in a few places messages are lost. Reading around, I found another module called socket.io-with-pm2-cluster, which acts as a plugin. But it asks me to configure things so that each instance listens on a different port: if the app runs on port 3000, instances 0, 1, 2 and 3 have to use ports 3001, 3002, 3003 and 3004. Can anyone suggest whether this is the right approach, or any other workarounds to make this possible?
I recommend using socket.io-redis for this purpose, which is the approach recommended by socket.io. If you scale out to multiple machines in the future (for example on AWS), it will keep working as expected, whereas the port-per-instance approach may fail across machines; you will still need sticky sessions at the load balancer.
socket.io needs to keep the socket open to pass events between server and client (and vice-versa), and you are running multiple workers; that is why messages are being lost in a few places.
Sticky load balancing
If you plan to distribute the load of connections among different processes or machines, you have to make sure that requests associated with a particular session id connect to the process that originated them.
You need to introduce a layer that makes your service stateless; you can use socket.io-redis:
By running socket.io with the socket.io-redis adapter you can run multiple socket.io instances in different processes or servers that can all broadcast and emit events to and from each other.
Passing events between nodes
Now that you have multiple Socket.IO nodes accepting connections, if you want to broadcast events to everyone (or even everyone in a certain room) you'll need some way of passing messages between processes or computers.
The interface in charge of routing messages is what we call the Adapter. You can implement your own on top of the socket.io-adapter (by inheriting from it) or you can use the one we provide on top of Redis: socket.io-redis:
var io = require('socket.io')(3000);
var redis = require('socket.io-redis');
io.adapter(redis({ host: 'localhost', port: 6379 }));
Then the following call:
io.emit('hi', 'all sockets');
will be broadcast to every node through the Pub/Sub mechanism of Redis.
You can read further details in the Socket.IO documentation on using multiple nodes.

Setting up websockets in a multi-instance node environment using PM2

My current setup is running multiple node instances using PM2 to manage the instances and act as a load balancer.
I would like to implement some functionality using websockets.
The first issue that came to mind is sharing the sockets among X node instances.
My understanding is that if I boot up a websocket-server within a node env only that env will have access to the sockets connected to it.
I do not want to load up web sockets for each instance for each user as that seems like a waste of resources.
Currently I am playing around with the websocket package on npm but I am in no way tied to this if there is a better alternative.
I would like the sockets to more or less push data one-way from server to client and avoid anything coming from the client to the server.
My solution so far is to spin up another node instance that solely acts as a websocket server.
This would allow a user to make all requests as normal to the usual instances but make a websocket connection to the separate node instance dedicated to sockets.
The app servers could then fire off messages to the dedicated socket server any time something is updated, so it can send data back to the appropriate clients.
I am not sure this is the best option and I am trying to see if there are other recommended ways of managing websockets across multiple node instances yet still allow me to spin up/down node instances as required.
I'd recommend you avoid a complex setup and just get socket.io working across multiple nodes, thus distributing the load. If you want to avoid data coming from the client to the server, just don't listen to incoming events on your server.
Socket.io supports multiple nodes, under the following conditions:
You have sticky sessions enabled. This ensures requests connect back to the process from which they originated.
You use a special adapter called socket.io-redis & a small Redis instance as a central point of storage - it keeps track of namespaces/rooms and connected sockets across your cluster of nodes.
Here's an example:
// setup io as usual
const io = require('socket.io')(3000)
// Set a redisAdapter as an adapter.
const redisAdapter = require('socket.io-redis')
io.adapter(redisAdapter({ host: 'localhost', port: 6379 }))
From then on, it's business as usual:
io.emit('hello', 'to all clients')
You can read more here: Socket.IO - Using Multiple Nodes.

Clustered HTTP server that syncs clients for file transfer

I'm writing a Node HTTP server that essentially only exists for NAT punchthrough. Its job is to facilitate a client sending a file, and another client receiving that file.
Edit: The clients are other Node processes, not browsers. We're using Websockets because some client locations won't allow non-HTTP/S port connections.
The overall process works like this:
All clients keep an open websocket connection.
The receiving client (Alice) tells the server via Websocket that it wants a file from another client (Bob).
The server generates a unique token for this transaction.
The server notifies Alice that it should download the file from /downloads?token=xxxx. Alice connects, and the connection is left open.
The server notifies Bob that it should upload the file to /uploads?token=xxxx. Bob connects and begins uploading the file, since Alice is already listening on the other side.
Once the transfer is complete, both connections are closed.
This is all accomplished by storing references to the HTTP req and res objects inside of a transfers object, indexed by the token. It all works great... as long as I'm not clustering the server.
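For reference, a minimal single-process sketch of that token-indexed piping (plain http module; the paths match the description above, validation and error handling omitted):

const http = require('http');

const transfers = {}; // token -> the receiver's open response stream

http.createServer((req, res) => {
  const url = new URL(req.url, 'http://localhost');
  const token = url.searchParams.get('token');

  if (url.pathname === '/downloads') {
    // Alice connects first; hold her response open until Bob uploads.
    transfers[token] = res;
  } else if (url.pathname === '/uploads') {
    // Pipe Bob's upload body straight into Alice's waiting response.
    const receiver = transfers[token];
    req.pipe(receiver);
    req.on('end', () => {
      delete transfers[token];
      res.end('done');
    });
  }
}).listen(3000);

Under cluster, Alice's and Bob's connections can land in different workers, so the transfers lookup fails, which is exactly the stated problem.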
In the past, when I've used clustering, I've converted the server to be stateless. However, that's not going to work here: I need the req and res objects to be stored in a state so that I can pipe the data.
I know that I could just buffer the transferring data to disk, but I would rather avoid that if at all possible: part of the requirements is that if I buffer anything to disk I must encrypt it, and I'd rather avoid putting the additional load of encrypting/decrypting transient data on the server.
Does anyone have any suggestion on how I could implement this in a way that supports clustering, or a pointer to further research sources I could look at?
Thanks so much!

How to make a big real-time app with socket.io or Lightstreamer and scale horizontally

I have some questions about real-time apps.
I am building a real-time app using socket.io, MongoDB and node.js. It works nicely as a prototype, but what will happen when the number of users increases?
I want to scale horizontally.
e.g. I have two servers (server A, server B):
Client A connects to server A.
Client B connects to server B.
How can client A send a message to client B? Dealing with different servers has been confusing me.
I found that Redis is used for this. Would a Redis server be enough on its own?
In the end, what should I use, and which tech (redis, lightstreamer, jabber, socket.io, nginx)?
You can't send a message directly from A to B because they aren't connected to the same server.
The solution is to enable communication between the two Node servers.
You mentioned Redis, so if you go that route you can have a central Redis server holding two lists (one for each server). When client A wants to reach client B, it sends its message to server A. Server A will not find client B among its local sockets and will push the message to Redis. Sooner or later, server B will collect its pending messages from Redis and dispatch them to client B.
It's a basic implementation that you can change to fit your needs. You can have, for example, a single list of messages per server, but also why not a list per user (with the server that holds this user's connection consuming it).
Also, as a side note, any central data store such as a database server (Mongo? MySQL?) can do the same as Redis. It all comes down to what you already have, what you can have, and what type of persistence you want.
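A minimal sketch of that per-server list approach (assuming the ioredis client; the inbox:<id> list names and message shape are made up):

const Redis = require('ioredis');
const pub = new Redis();
const sub = new Redis(); // blocking reads need their own connection

const SERVER_ID = 'A'; // this server's inbox is the list 'inbox:A'

// Server A can't find client B locally, so it pushes the message onto
// the target server's list.
function forward(targetServer, message) {
  pub.lpush('inbox:' + targetServer, JSON.stringify(message));
}

// Each server blocks on its own list and dispatches to local sockets.
async function consumeInbox(dispatch) {
  for (;;) {
    const [, raw] = await sub.brpop('inbox:' + SERVER_ID, 0);
    dispatch(JSON.parse(raw));
  }
}

consumeInbox((msg) => {
  // Look up msg.to among this server's connected sockets and deliver.
  console.log('deliver', msg.body, 'to', msg.to);
});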
