Scaling NodeJS chat room - store connection status in Redis

The chat room app runs on multiple servers and consists of two services:
1. connection manager
Before joining the chat room, a client first asks the connection manager for a chat service URL.
2. chat service
A typical socket.io based chat implementation.
I need to store each client's connection status in Redis, such as which room a user is connected to and how many users are in each room, so the connection manager can use that data for load balancing.
I can use the socket connection/disconnect events to maintain the current connection status in Redis, but in case of a NodeJS server failure, how do I make sure the Node and Redis data stay synchronized? What's the best way to do this?

I can use the socket connection/disconnect events to maintain the current connection status in Redis, but in case of a NodeJS server failure, how do I make sure the Node and Redis data stay synchronized?
For example, you can create a set in Redis that contains references to the keys managed by a specific server node. If that node goes down or is restarted, you can invalidate those keys.
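A minimal sketch of that idea, assuming socket.io v4 and the ioredis client; the key names, the `join` event, and `CHAT_SERVICE_ID` are illustrative, not from the original post:

```js
// Track connection status in Redis, and keep a per-server set of owned keys
// so a restarted node can invalidate anything it wrote before crashing.
const { Server } = require('socket.io');
const Redis = require('ioredis');

const redis = new Redis();
const io = new Server(3000);

const SERVER_ID = process.env.CHAT_SERVICE_ID || 'chat-1';
const serverKeySet = `server:${SERVER_ID}:keys`; // keys managed by this node

async function cleanupStaleKeys() {
  // On startup, drop everything this node wrote before it went down.
  const stale = await redis.smembers(serverKeySet);
  if (stale.length) await redis.del(...stale);
  await redis.del(serverKeySet);
}

io.on('connection', (socket) => {
  socket.on('join', async (room, userId) => {
    const userKey = `user:${userId}`;
    await redis
      .multi()
      .hset(userKey, 'room', room, 'server', SERVER_ID)
      .sadd(`room:${room}:users`, userId)            // who is in which room
      .sadd(serverKeySet, userKey, `room:${room}:users`)
      .exec();
    socket.data.userId = userId;
    socket.data.room = room;
    socket.join(room);
  });

  socket.on('disconnect', async () => {
    const { userId, room } = socket.data;
    if (!userId) return;
    await redis.multi().del(`user:${userId}`).srem(`room:${room}:users`, userId).exec();
  });
});

cleanupStaleKeys();
```

An alternative (or complement) is to give each key a TTL and refresh it from a periodic heartbeat, so keys written by a node that dies without cleaning up simply expire.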

Related

Using Node.js and Mongoose: Is switching between multiple connections on a MongoDB server an inefficient alternative?

I have several clients. Each client has multiple data acquisition stations, and each station has multiple sensors. I need to store this data on a MongoDB server (using Mongoose and Node.js), so I thought about the following organization:
Each client has its own database inside the MongoDB server.
Within this database, each station has its own collection.
Data is sent to the MongoDB server via an MQTT broker (Node.js). So, depending on which client the broker receives the message from, I need to create a new connection to the MongoDB server (using mongoose.createConnection).
I'm not sure if this is the best alternative. I don't know if creating multiple different connections will slow down the system.
What do you think?
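One hedged way to avoid opening a connection per message is to cache one Mongoose connection per client database and reuse it; `getClientConnection` and the URI below are illustrative, not from the question:

```js
// Reuse one Mongoose connection per client database instead of creating a
// new one every time the MQTT broker delivers a message. Names are assumed.
const mongoose = require('mongoose');

const connections = new Map(); // clientId -> mongoose Connection

function getClientConnection(clientId) {
  if (!connections.has(clientId)) {
    connections.set(
      clientId,
      mongoose.createConnection(`mongodb://localhost:27017/client_${clientId}`)
    );
  }
  return connections.get(clientId);
}

// Example: store a sensor reading in the station's own collection.
async function saveReading(clientId, stationId, reading) {
  const conn = getClientConnection(clientId);
  const name = `station_${stationId}`;
  const Reading =
    conn.models[name] ||
    conn.model(
      name,
      new mongoose.Schema({ sensor: String, value: Number, at: Date }),
      name // explicit collection name
    );
  await Reading.create({ ...reading, at: new Date() });
}
```

Mongoose also exposes `connection.useDb(dbName)`, which switches databases while reusing the same underlying connection pool, so a single connection plus `useDb` per client may be enough instead of one `createConnection` per client.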

Handling reconnections in the socket.io server?

When the socket.io client performs an (automatic) reconnection - as might happen if a mobile client went to sleep then woke up again - does the server get a reconnect event? Or does it just see a disconnection and fresh connection?
In either case, is there a way to:
1. identify that it's the same client, e.g. by a unique client id that persists across connections
2. have the client automatically re-join any rooms it was in before
Or do I need to code that functionality manually, e.g. by having the client supply the id or rooms itself on reconnection?
I had a read of the socket.io docs and can't see any list of events that the server might receive.
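A minimal sketch of the manual approach mentioned in the question, assuming socket.io v4 with ioredis; the `auth` payload and the Redis key that remembers rooms are illustrative:

```js
// Server side: on every (re)connection the client sends a persistent userId,
// and the server re-joins the rooms recorded for that user. Sketch only.
const { Server } = require('socket.io');
const Redis = require('ioredis');

const io = new Server(3000);
const redis = new Redis();

io.on('connection', async (socket) => {
  const { userId } = socket.handshake.auth; // supplied by the client
  if (!userId) return socket.disconnect(true);

  // Re-join whatever rooms this user was in before the reconnect.
  const rooms = await redis.smembers(`user:${userId}:rooms`);
  rooms.forEach((room) => socket.join(room));

  socket.on('join', async (room) => {
    socket.join(room);
    await redis.sadd(`user:${userId}:rooms`, room);
  });
});

// Client side: the same userId is sent on the initial connection and on
// every automatic reconnection.
// const socket = io('https://chat.example.com', { auth: { userId: 'u123' } });
```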

Redis Pub/Sub Scaling

I've been rewriting my nodejs game app to read memory from redis so I could ultimately scale it if it were to ever grow large. But, I stopped because I feel like I am going about it the wrong way. For example:
Server 1 is on port 9300
Server 2 is on port 9301
Now, let's say a player from the Server 1 wants to send a private message to a player that is on Server 2.
What I currently do
Server 1 will send a publish signal to Redis, and Server 2 (like every other server) will catch that signal; if the target user is on that server, it will send them a notification along with the message.
Some questions
1) Wouldn't it be more appropriate to just have Server 1 send a message to Server 2 without publishing to Redis?
2) Server 1 doesn't keep track of all the connected clients that Server 2 has, so that wouldn't be possible. Unless I keep track of all connected clients on every server as well? That would require the client to connect to multiple servers on each visit.
3) Let's say I have 10 servers. A user on Server 5 wants to send a private message to a user on Server 1. If I send a pub signal through Redis, Servers 2, 3, 4, 6, 7, 8, 9 will all receive that signal as well, which is unneeded. Is that when peer-to-peer connections come into play? Or is that just the extra bandwidth required for scaling, and I'm overthinking everything?
Perhaps you could change the concept of your app by using RabbitMQ instead of Redis pub/sub. RMQ would allow smarter message routing.
Basically, each user can listen for their own messages:
A user connects to a server (1-x)
That server subscribes to an RMQ exchange for messages with the user's routing key
When a user publishes a private message, it is sent to the exchange with the specific user's routing key
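A rough sketch of that routing-key idea using the amqplib package; the exchange name and message shape are assumptions:

```js
// Each server binds a per-user queue to a direct exchange, using the user id
// as the routing key, so it only receives messages for its own users.
const amqp = require('amqplib');

const EXCHANGE = 'private-messages';

async function listenForUser(channel, userId, onMessage) {
  await channel.assertExchange(EXCHANGE, 'direct', { durable: false });
  const { queue } = await channel.assertQueue('', { exclusive: true });
  await channel.bindQueue(queue, EXCHANGE, userId); // routing key = user id
  await channel.consume(
    queue,
    (msg) => msg && onMessage(JSON.parse(msg.content.toString())),
    { noAck: true }
  );
}

function sendPrivateMessage(channel, toUserId, payload) {
  channel.publish(EXCHANGE, toUserId, Buffer.from(JSON.stringify(payload)));
}

// Usage on the server a user connects to:
// const conn = await amqp.connect('amqp://localhost');
// const channel = await conn.createChannel();
// await listenForUser(channel, 'user42', (msg) => socket.emit('private', msg));
```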
Still, that concept will hit a limit if you get a large number of users: the connection count on the RMQ server will grow quickly. In that case you can scale RMQ or change the connection concept:
Save all user connection info in Redis
When a user sends a private message to another user, first find which server instance that user is on
Send the message only to that specific server instance (fetched from the user connection information)
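A sketch of that second approach, again assuming ioredis; the key and channel names are illustrative:

```js
// Look up which server instance the recipient is on, then publish only on
// that server's channel instead of broadcasting to every server.
const Redis = require('ioredis');

const redis = new Redis();

// Called when a user connects to a server instance.
async function registerConnection(userId, serverId) {
  await redis.set(`user:${userId}:server`, serverId);
}

// Called on the server where the sender is connected.
async function sendPrivateMessage(fromUserId, toUserId, text) {
  const serverId = await redis.get(`user:${toUserId}:server`);
  if (!serverId) return; // recipient is offline
  await redis.publish(
    `server:${serverId}:messages`,
    JSON.stringify({ from: fromUserId, to: toUserId, text })
  );
}

// Each server instance subscribes only to its own channel.
const sub = new Redis();
sub.subscribe(`server:${process.env.SERVER_ID || 'server-1'}:messages`);
sub.on('message', (_channel, raw) => {
  const msg = JSON.parse(raw);
  // deliver msg to the locally connected socket for msg.to
});
```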

how to send a message to individual clients with socket.io with multiple server processes?

I'm about to begin with socket.io and this is more of a theoretical question.
Let's say that I want to send a message to a specific user with socket.io.
Normally I would store the socket id with the relevant user id and, when sending, look up the socket id and send to it.
But what if I have multiple server processes running? I'll have to make sure that the correct server, the one the client is actually connected to, does the sending. Is that possible?
For multiple server instances, you need a caching service (memcache, Redis) for authentication and a central message queue service (stormMQ, RabbitMQ, AQ, or another Java-based MQ) that all your Node instances bind to. A Node instance binds to the message queue for each client/channel/whatever, and all the other bound Node instances receive the messages and forward them to the client.
The problem is typically about how to work with a WebSocket cluster:
Several front-end servers which will be in charge of handling bidirectional connections with each client. They form the WebSocket cluster.
Several back-end servers which will be in charge of handling the business logic of your application.
Each time the back-end wants to inform a client, it sends a request to the WebSocket cluster, which is responsible for communicating with that client.
A possible scenario:
Identify each WebSocket cluster's server with a unique id.
Identify each client with a unique id.
Each time a client connects to one of your WebSocket cluster's servers, store its unique id along with the server's unique id in a distributed key/value database.
Thus you know which client is connected with which server.
The next time your back-end application wants to notify a client there are two possibilities:
The pair (clientId, serverId) is not present in the database and you cannot inform the client.
The pair (clientId, serverId) is present in the database; then you have to ask the server identified by serverId to notify the client identified by clientId.
Notes:
Each WebSocket cluster's server can run a node.js instance supercharged with socket.io. It has to provide a route which takes the clientId as a parameter and uses socket.io to notify this client. Indeed, socket.io knows which client is using which socket on this server.
Every time one of your servers crashes, you have to clean your database and remove all pairs that contain that server's id.
Deploying a WebSocket cluster can be tedious, so you have commercial offers like Kaazing.
A good distributed key/value database is Riak. It is better suited than Redis or Memcached for the above purpose because it can easily be distributed within a data center and across several data centers.
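A minimal sketch of that scenario, using Redis as the key/value store and an HTTP route on each socket.io node; Express, the route path, and the key names are assumptions (the answer above suggests Riak, so treat Redis here as a stand-in):

```js
// On each WebSocket server: remember which clients are connected locally,
// record clientId -> serverId in the shared store, and expose a route the
// back-end can call to notify a given client. Names are illustrative.
const express = require('express');
const http = require('http');
const { Server } = require('socket.io');
const Redis = require('ioredis');

const SERVER_ID = process.env.SERVER_ID || 'ws-1';
const redis = new Redis();

const app = express();
const httpServer = http.createServer(app);
const io = new Server(httpServer);

const localSockets = new Map(); // clientId -> socket on this instance

io.on('connection', async (socket) => {
  const clientId = socket.handshake.auth.clientId;
  localSockets.set(clientId, socket);
  await redis.set(`client:${clientId}:server`, SERVER_ID);

  socket.on('disconnect', async () => {
    localSockets.delete(clientId);
    await redis.del(`client:${clientId}:server`);
  });
});

// The back-end reads client:<id>:server from the store, then calls this
// route on the matching instance.
app.post('/notify/:clientId', express.json(), (req, res) => {
  const socket = localSockets.get(req.params.clientId);
  if (!socket) return res.sendStatus(404);
  socket.emit('notification', req.body);
  res.sendStatus(204);
});

httpServer.listen(3000);
```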

Retrieve Socket.io Client from Redis

I'm building a real time data system that allows an Apache/PHP server to send data to my Node.js server, which will then immediately send that data to the associated client via socket.io. So the Apache/PHP server makes a request that includes the data, as well as a user token that tells Node.js which user to send the data to.
Right now this is working fine - I've got an associative array that ties the user's socket.io connection to their user token. The problem is that I need to start scaling this to multiple servers. Naturally, with the default configs of socket.io I can't share connections between node workers.
The solution I had in mind was to use the RedisStore functionality, and just have each of my workers looking at the same Redis store. I've been doing research and there's a lot of documentation on how to use pub/sub functionality for broadcasting messages to large groups (rooms). That's fine, but I need to be able to send messages to a single client, so I need some way to retrieve a user's socket.io connection from the RedisStore.
The only way I can think to do this right now is to create a ton of 'rooms' named with the user's token, and only have one user in each room. Then I could just emit to that room. However, that seems very inefficient.
Is there a better way that I can retrieve user's unique socket.io connections from Redis?
Once a socket connection is made to a server running Node, it stays connected to that instance.
So it seems you need a way for your PHP server to know which Node server a client is connected to.
In your Redis store you could simply store the server's id as the value, keyed by the client id (the user token). Then PHP looks up which Node server to use and makes the request to it.
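A short sketch of that mapping, assuming socket.io v4 and ioredis; the key pattern and the `push` event name are illustrative:

```js
// Each worker keeps its own userToken -> socket map and records in Redis
// which worker owns each token, so the PHP side knows which Node server
// to send its request to.
const { Server } = require('socket.io');
const Redis = require('ioredis');

const WORKER_ID = process.env.WORKER_ID || 'node-1';
const redis = new Redis();
const io = new Server(8080);

const socketsByToken = new Map();

io.on('connection', (socket) => {
  const token = socket.handshake.auth.userToken;
  socketsByToken.set(token, socket);
  redis.set(`token:${token}:worker`, WORKER_ID);

  socket.on('disconnect', () => {
    socketsByToken.delete(token);
    redis.del(`token:${token}:worker`);
  });
});

// Handler for the request forwarded from Apache/PHP: PHP first reads
// token:<userToken>:worker from Redis, then posts the data to that worker,
// which emits straight to the matching socket.
function pushToUser(token, data) {
  const socket = socketsByToken.get(token);
  if (socket) socket.emit('push', data);
}
```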
