I am trying to scale my Express backend. The problem is that every time a user connects, or whenever I restart the server, they get a new socket.id. On top of that, I can't save the whole socket object into memory because it gives me a [Circular JSON] error. How do I save some part of the socket into Redis so that I can retrieve the same socket from other servers?
You need to decouple the user from the socket.id. The socket.id is volatile and can change even on a browser refresh. Instead, when a user's socket connects, look at the handshake data that is passed with the connection and use it to associate the socket with the user. As far as persisting socket data in Redis goes, that can already be handled for you by socket.io-redis.
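To make that concrete, here is a minimal sketch. It assumes the client sends an auth token in the handshake query (io(url, { query: { token } }) on the client side), that redis is a connected node_redis client, and that verifyToken is a hypothetical helper that resolves the token to a stable user ID:

io.on('connection', (socket) => {
  const token = socket.handshake.query.token; // data sent with the handshake
  const userId = verifyToken(token);          // hypothetical helper -> stable user ID

  // Key the mapping on the stable userId, not the volatile socket.id
  redis.hset('online-users', userId, socket.id);

  socket.on('disconnect', () => {
    redis.hdel('online-users', userId);
  });
});

With a mapping like that, you never have to serialize the socket object itself: any process can look up the current socket.id for a user, and socket.io-redis takes care of delivering emits across servers.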
Take a look at this link for scaling out socket.io: http://socket.io/docs/using-multiple-nodes
I'm wondering how to use socket.io properly with my Express app.
I have a REST API written in Express/Node.js and I want to use socket.io to add real-time features to my app. Suppose I want to do something I can already do just by sending a request to my REST API. What should I do with socket.io? Should I send the request to the REST API and pass the result of the process on to the socket.io client, or handle the whole process within a socket.io handler and then send the result to the socket.io client?
Thanks in advance.
The question is not that clear, but from what I'm getting from it, you want to know what you would use it for that you can't already do with your current API?
The short answer is: well, nothing really. WebSockets are just the natural progression of APIs and the need for a more 'real-time' interface between systems.
An older method (still used and relevant for the right use case) is long polling, where you keep checking back with the server for updated items and grab them if there are any. This works, but it can be expensive in terms of establishing a connection, performing a lookup, and then closing the connection.
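As a rough sketch of that older approach (the /updates endpoint and render function are just placeholders), the client simply keeps asking:

let lastSeen = 0;
setInterval(async () => {
  // Open a connection, ask for anything new since the last check, then close it
  const res = await fetch('/updates?since=' + lastSeen); // hypothetical endpoint
  const updates = await res.json();
  if (updates.length) {
    lastSeen = updates[updates.length - 1].id;
    render(updates); // hypothetical render step
  }
}, 5000); // check back every 5 seconds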
WebSockets keep that connection open, allowing both the client and server to communicate in real time. So, for example, say you make an update to your backend data and want users to get that update. Using long polling you would rely on each client to ping back to the server, check whether there is an update, and grab it if so. This can cause lag between updates: some users have the updated data while others do not, etc.
Now take the same scenario with WebSockets: you make an update to the backend data and hit submit, which emits to your socket server. The socket server takes the call, performs the task (grabs the updated data) and emits it to the users, and each connected user instantly gets that update.
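A minimal server-side sketch of that flow with socket.io (saveToDatabase is a placeholder for whatever persistence you use):

io.on('connection', (socket) => {
  socket.on('update-data', async (payload) => {
    const updated = await saveToDatabase(payload); // hypothetical persistence call
    io.emit('data-updated', updated);              // every connected user gets it instantly
  });
});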
Socket servers are typically used for things like real-time chat or live polls, where packets are small, but they are also used for web games etc. The size of your payloads will determine how best to send data back and forth, because the larger the payload, the more resources and bandwidth it takes on the socket server, so that's something to consider.
I am working on setting up socket.io in cluster mode using PM2.
I am using socket.io-redis package and it works fine in cluster mode.
But the problem arises when I want to access all connected sockets, because in cluster mode each process doesn't know about the socket connections held by the other processes.
I thought socket.io-redis keeps track of all the connected sockets and all their session info, but it doesn't.
Is there any way to access all the socket connections that exist across all processes in socket.io/Node.js?
Socket.io-redis does keep track, in a sense.
From their docs
"The Redis adapter extends the broadcast function of the in-memory adapter: the packet is also published to a Redis channel (see below for the format of the channel name).
Each Socket.IO server receives this packet and broadcasts it to its own list of connected sockets."
So basically, Redis is used as the broker that tells each socket server to emit on a given channel, which lets a socket.io server work in cluster mode. But, as you have mentioned, it falls short when you need to keep track of things beyond just an emit.
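For reference, wiring up the adapter is only a couple of lines (the host and port are just example values):

const io = require('socket.io')(3000);
const redisAdapter = require('socket.io-redis');

// Broadcasts from any worker are published through Redis and re-emitted
// by every other worker to its own connected sockets.
io.adapter(redisAdapter({ host: 'localhost', port: 6379 }));

io.to('room-42').emit('news', { hello: 'everyone' });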
So where does this leave us? Well, you can use custom hooks via socket.io-redis, but personally I found them really difficult to understand and use, and I had limited success. I think with the newer versions of socket.io and socket.io-redis there were some tweaks to make this simpler, but I have not tried them.
Instead, what we do is use Redis hset and hget to store the socket together with the user's ID. Then, when we want to get all users online, we can query Redis for the list of online users, or the users in a specific room, etc.
What you will want to do is add the redis package and connect to it, in addition to the regular pub/sub.
Then, when a user joins a room (or your server, for that matter) you do an hset. On the first join, ours looks something like this:
redis.hset([collection-name],[Field],[value])
So in code it looks like
redis.hset(decoded.cID,"socket-" + socket.id,socket.nickname)
This sets a value in Redis. The collection name is the key (for us it's the unique ID of the channel); then we store 'socket-' plus the socket.id as the field, along with a nickname as the value. That value is the user's ID, or 'anonymous' if they are not logged in.
Then, when we want to see who is in a room, we use the hgetall command:
redis.hgetall([collection-name], function(err, results){ ... });
So inside of, say, the emit, we call redis.hgetall to get all items inside the specific collection we pass in, and send that back to all connected users.
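Putting those pieces together, the whole pattern looks roughly like this (redis is a node_redis client, and decoded.cID and socket.nickname are assumed to come from your own auth step, as in the snippets above):

io.on('connection', (socket) => {
  socket.on('join', (decoded) => {
    socket.join(decoded.cID); // join the channel's room so we can emit to it

    // Record this socket under the channel's hash
    redis.hset(decoded.cID, 'socket-' + socket.id, socket.nickname);

    // Grab everyone currently recorded for the channel and send the list out
    redis.hgetall(decoded.cID, (err, members) => {
      if (err) return console.error(err);
      io.to(decoded.cID).emit('online-users', members);
    });

    socket.on('disconnect', () => {
      // Clean the entry up so the hash doesn't go stale
      redis.hdel(decoded.cID, 'socket-' + socket.id);
    });
  });
});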
I've seen a bunch of different ways of actually setting up a MongoDB connection:
(1) Some code where people don't use the open or error event at all
(2) mongoose.connection.on('open', callback);
(3) mongoose.connection.once('open', callback);
My take on it is:
If my app only connects to the database when it needs to use it, use (2)
If my app is constantly connected to the database ... it doesn't matter if I use (2) or (3)?
Which also raises the question, should my app maintain a persistent connection to the database (server and database running on same machine)?
Thanks for any help
You are correct that it doesn't matter if you use (2) or (3) when your application is constantly connected to the database.
As far as a persistent connection goes, the only cost is a TCP keepalive packet every once in a while. It's up to you to decide whether the extra socket is worth not having to make a new connection for every call.
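For what it's worth, a minimal persistent setup with both handlers wired up looks something like this (the connection string is a placeholder):

const mongoose = require('mongoose');

mongoose.connect('mongodb://localhost/myapp'); // placeholder URI

mongoose.connection.on('error', (err) => {
  console.error('MongoDB connection error:', err);
});

mongoose.connection.once('open', () => {
  console.log('MongoDB connection open');
  // safe to start the HTTP server / register routes here
});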
The chat room app runs on multiple servers and consists of two services:
1. connection manager
Before joining the chat room, the client first asks the connection manager for a chat service URL.
2. chat service
A typical socket.io based chat implementation.
I need to store each client's connection status in Redis, such as which room each user is connected to, how many users are in each room, etc., so the connection manager can use that data for load balancing.
I can use socket connection/disconnect event to maintain the current connection status in Redis, but in case of NodeJS server failure, how to make sure Node and Redis data are synchronized? What's the best way to do this?
"I can use socket connection/disconnect event to maintain the current connection status in Redis, but in case of NodeJS server failure, how to make sure Node and Redis data are synchronized?"
For example, you can create a set in Redis that contains references to the keys managed by a specific server node. If that node goes down or is restarted, you can invalidate those keys.
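A rough sketch of that idea, assuming a node_redis client and a SERVER_ID environment variable identifying each Node process (both names are placeholders):

const serverKey = 'server:' + process.env.SERVER_ID + ':sockets';

// On startup (i.e. after a crash or restart), invalidate whatever this node
// left behind last time, so Redis matches the actual connections again.
redis.smembers(serverKey, (err, staleIds) => {
  if (err || !staleIds || !staleIds.length) return;
  staleIds.forEach((id) => redis.hdel('socket-rooms', id));
  redis.del(serverKey);
});

io.on('connection', (socket) => {
  redis.sadd(serverKey, socket.id);                   // this node owns this socket
  redis.hset('socket-rooms', socket.id, socket.room); // whatever room info you track (placeholder)

  socket.on('disconnect', () => {
    redis.srem(serverKey, socket.id);
    redis.hdel('socket-rooms', socket.id);
  });
});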
I'm building a real time data system that allows an Apache/PHP server to send data to my Node.js server, which will then immediately send that data to the associated client via socket.io. So the Apache/PHP server makes a request that includes the data, as well as a user token that tells Node.js which user to send the data to.
Right now this is working fine - I've got an associative array that ties the user's socket.io connection to their user token. The problem is that I need to start scaling this to multiple servers. Naturally, with the default configs of socket.io I can't share connections between node workers.
The solution I had in mind was to use the RedisStore functionality, and just have each of my workers looking at the same Redis store. I've been doing research and there's a lot of documentation on how to use pub/sub functionality for broadcasting messages to large groups (rooms). That's fine, but I need to be able to send messages to a single client, so I need some way to retrieve a user's socket.io connection from the RedisStore.
The only way I can think to do this right now is to create a ton of 'rooms' named with the user's token, and only have one user in each room. Then I could just emit to that room. However, that seems very inefficient.
Is there a better way that I can retrieve user's unique socket.io connections from Redis?
Once a socket connection is made to a particular Node instance, it stays connected to that instance.
So it seems you need a way for your PHP server to know which Node server a client is connected to.
In your Redis store you could just store the server's ID as the value, keyed by the client ID. Then PHP looks up which Node server to use and makes the request to it.
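A sketch of the Node side of that, assuming each worker knows its own SERVER_ID and the PHP app reads the same Redis instance (the token in the handshake query is the same user token the PHP server already has):

const localSockets = {}; // live sockets on this instance, keyed by user token

io.on('connection', (socket) => {
  const token = socket.handshake.query.token;

  redis.hset('user-servers', token, process.env.SERVER_ID); // which instance owns this user
  localSockets[token] = socket;                             // so this instance can emit directly

  socket.on('disconnect', () => {
    redis.hdel('user-servers', token);
    delete localSockets[token];
  });
});

// When PHP's request reaches the right instance:
// localSockets[token].emit('data', payload);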