Multiple websockets onto multiple servers: how do they communicate? - node.js

I have a node server accepting websocket connections from the clients. Each client can broadcast a message to all of the other clients.
UPDATE: I am using https://github.com/websockets/ws as my library of choice.
At the moment, the server has an array with all of the connections. Each connection has a tabId. When one of the clients emits a message, I go through all of the connections and check: if the connection's tabId doesn't match the sender's, I send the message to that client.
For load reasons, I will need more than one server. So there will be, say, two servers, each with a number of clients.
How do I make sure that a message gets broadcast to all of the websocket clients, and not only the ones connected to the same server?
One possible solution I thought of is to store the connections in a database, where each record has the tabId and the serverId. However, even a simple broadcast gets tricky: messages to "local" sockets are easy to broadcast (the socket is local and available), whereas messages to "remote" sockets are harder and would imply inter-server communication.
Is there a good pattern to solve this? Surely, this is something that people face every day.

You could use a message queue like RabbitMQ.
When a client logs in to your server, create a consumer that listens to a queue which receives the messages directed to that particular client. When clients send messages, just use a publisher to publish them to the recipient's queue.
This way you don't need to know which nodes the clients are on, and it doesn't matter if they jump from one node to another.
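To make the idea above concrete, here is a minimal sketch. It is not RabbitMQ code: `FakeBroker` is a hypothetical in-process stand-in for the broker, and `ChatServer`, `login` and `sendTo` are names made up for the illustration. It only shows why the sending server never needs to know where the recipient is connected.

```javascript
// In-process stand-in for a message broker such as RabbitMQ (illustration only).
// A real setup would use a client library and a broker reachable by every server.
class FakeBroker {
  constructor() {
    this.consumers = new Map(); // queue name -> consumer callback
    this.backlog = new Map();   // queue name -> messages waiting for a consumer
  }

  // Register the single consumer for a queue and hand it anything already waiting.
  consume(queue, onMessage) {
    this.consumers.set(queue, onMessage);
    const waiting = this.backlog.get(queue) || [];
    this.backlog.delete(queue);
    waiting.forEach((message) => onMessage(message));
  }

  // Deliver to the queue's consumer, or keep the message until one appears.
  publish(queue, message) {
    const consumer = this.consumers.get(queue);
    if (consumer) {
      consumer(message);
      return;
    }
    if (!this.backlog.has(queue)) this.backlog.set(queue, []);
    this.backlog.get(queue).push(message);
  }
}

// Each server only knows its own sockets; the broker hides where the others live.
class ChatServer {
  constructor(broker) {
    this.broker = broker;
  }

  // Called when a client logs in; `send` writes to that client's websocket.
  login(clientId, send) {
    this.broker.consume(`client.${clientId}`, send);
  }

  // Called when a connected client wants to message another client.
  sendTo(recipientId, message) {
    this.broker.publish(`client.${recipientId}`, message);
  }
}

// Usage: alice is on server 1, bob is on server 2.
const broker = new FakeBroker();
const server1 = new ChatServer(broker);
const server2 = new ChatServer(broker);
const bobInbox = [];
server1.login('alice', () => {});
server2.login('bob', (message) => bobInbox.push(message));
server1.sendTo('bob', 'hello from alice');
console.log(bobInbox); // [ 'hello from alice' ]
```

A broadcast to everyone except the sender would publish once per recipient queue in this model.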

Related

Chat / System Communication App (Nodejs + RabbitMQ)

I currently have a chat system running NodeJS that passes messages via RabbitMQ. Each connected user has their own unique queue and listens only to messages meant for them. The backend can also use this chat pipeline to communicate other system messages, like notifications, friend requests and other user-event-driven information.
Currently the backend has to loop and publish each message one by one per user, even if the payload of the message is the same for, let's say, 1000 users. I would like to get away from that and be able to send the same message to multiple different users, but not to EVERY user who's connected.
(Example: notifying certain users that their friend has come online.)
I considered implementing a RabbitMQ setup where all messages are pooled into the same queue and, instead of RabbitMQ delivering to every user queue, Node takes these messages and emits each one to the appropriate users via socket connections (to whoever is online).
Proposed - infrastructure
This way the backend does not need to loop for 100s and 1000s of users and can send a single payload containing all users this message should go to. I do plan to cluster the nodejs servers together.
I was also wondering, since I've never done this in a production environment: will I need to track each socket ID?
Potential pitfalls I've identified so far:
slower, since thousands of messages can pile up in a single queue.
manually storing socket IDs to manually transmit to users.
offloading routing to NodeJS instead of RabbitMQ.
Has anyone done anything like this before? If so, what are your recommendations? Is it better to scale with unique per-user queues, or to pool all grouped messages for all users into a smaller number of larger queues?
as a general rule, queue-per-user is an anti-pattern. there are some valid uses of this, but i've never seen it be a good idea for a chat app (in spite of all the demos that use this example)
RabbitMQ can be a great tool for facilitating the delivery of messages between systems, but it shouldn't be used to push messages to users.
"I considered implementing a RabbitMQ setup where all messages are pooled into the same queue and, instead of RabbitMQ delivering to every user queue, Node takes these messages and emits each one to the appropriate users via socket connections (to whoever is online)."
this is heading down the right direction, but you have to remember that RabbitMQ is not a database (see previous link, again).
you can't randomly seek specific messages that are sitting in the queue and then leave them there. they are first in, first out.
in a chat app, i would have rabbitmq handling the message delivery between your systems, but not involved in delivery to the user.
your thoughts on using web sockets are going to be the direction you want to head for this. either that, or Server Sent Events.
if you need persistence of messages (history, search, last-viewed location, etc) then use a database for that. keep a timestamp or other marker of where the user left off, and push messages to them starting at that spot.
your concerns about tracking sockets for the users are definitely something to think about.
if you have multiple instances of your node server running sockets with different users connected, you'll need a way to know which users are connected to which node server.
this may be a good use case for rabbitmq - but not in a queue-per-user manner. rather, in a binding-per-user. you could have each node server create a queue to receive messages from the exchange where messages are published. the node server would then create a binding between the exchange and queue based on the user id that is logged in to that particular node server
this could lead to an overwhelming number of bindings in rmq, though.
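a rough sketch of the binding-per-user idea, with the exchange simulated in-process. `FakeExchange` and its `bind` / `unbind` / `publish` methods are hypothetical names for this illustration, not a RabbitMQ client API. each node server owns one queue, and a binding is added or removed as users connect and disconnect:

```javascript
// In-process stand-in for a direct exchange (illustration only, not a RabbitMQ API).
class FakeExchange {
  constructor() {
    this.bindings = new Map(); // routing key (user id) -> Set of bound queues
  }

  // Bind a node server's queue to a user id when that user logs in there.
  bind(queue, userId) {
    if (!this.bindings.has(userId)) this.bindings.set(userId, new Set());
    this.bindings.get(userId).add(queue);
  }

  // Remove the binding when the user disconnects from that node server.
  unbind(queue, userId) {
    const queues = this.bindings.get(userId);
    if (!queues) return;
    queues.delete(queue);
    if (queues.size === 0) this.bindings.delete(userId);
  }

  // Route a message to every queue bound with this user id; unrouted messages are dropped.
  publish(userId, message) {
    const queues = this.bindings.get(userId) || [];
    for (const queue of queues) queue.push({ userId, message });
  }
}

// One queue per node server (plain arrays here), one binding per logged-in user.
const exchange = new FakeExchange();
const nodeAQueue = [];
const nodeBQueue = [];

exchange.bind(nodeAQueue, 'user-1'); // user-1 connects to node A
exchange.bind(nodeBQueue, 'user-2'); // user-2 connects to node B

exchange.publish('user-2', 'friend came online'); // lands only in node B's queue
exchange.unbind(nodeBQueue, 'user-2');            // user-2 disconnects
exchange.publish('user-2', 'nobody is bound');    // dropped in this sketch

console.log(nodeAQueue.length, nodeBQueue.length); // 0 1
```

the `bindings` map grows with the number of connected users, which is the scaling concern mentioned above.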
you may need a more intelligent method of tracking which server has which users connected, or just ignore that entirely and broadcast every message to every node server. in that case, each server would publish an event through the websocket based on who the message should be delivered to.
if you're using a smart enough websocket library, it will only send the message to the people that need it. socket.io did this, i know, and i'm sure other websocket libraries are smart like this, as well.
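here is a small sketch of the "broadcast to every node server" option, under the assumption that the backend publishes a single payload with a list of recipient ids. the `fanout` emitter stands in for whatever delivers the message to every node server, and `createNodeServer`, `connect` and `disconnect` are hypothetical names:

```javascript
const { EventEmitter } = require('events');

// Stand-in for a fanout exchange: every node server receives every message.
const fanout = new EventEmitter();

// Create a node server that tracks only its own locally connected users.
function createNodeServer(name) {
  const localSockets = new Map(); // userId -> function that writes to that user's socket

  fanout.on('message', ({ recipients, payload }) => {
    for (const userId of recipients) {
      const send = localSockets.get(userId);
      if (send) send(payload); // users on other nodes (or offline) are skipped here
    }
  });

  return {
    name,
    connect: (userId, send) => localSockets.set(userId, send),
    disconnect: (userId) => localSockets.delete(userId),
  };
}

// Usage: one publish from the backend, regardless of how many recipients there are.
const nodeA = createNodeServer('A');
const nodeB = createNodeServer('B');
const delivered = [];
nodeA.connect('u1', (payload) => delivered.push(['u1', payload]));
nodeB.connect('u2', (payload) => delivered.push(['u2', payload]));
nodeB.connect('u3', (payload) => delivered.push(['u3', payload]));

// u9 is not connected anywhere, u3 is connected but not a recipient.
fanout.emit('message', { recipients: ['u1', 'u2', 'u9'], payload: 'friend online' });
console.log(delivered);
```

every node server does a little wasted work per message, but nothing has to track which server holds which user.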
...
I probably haven't given you a concrete answer to your situation, and I'm sure you have a lot more context to consider. hopefully this will get you down the right path, though.

Sending data from RabbitMQ to Node.JS via Socket.IO

I am going to design a system where there is a two-way communication between clients and a web application. The web application can receive data from the client so it can persist it to a DB and so forth, while it can also send instructions to the client. For this reason, I am going to use Node.JS and Socket.IO.
I also need to use RabbitMQ since I want that if the web application sends an instruction to a client, and the client is down (hence the socket has dropped), I want it to be queued so it can be sent whenever the client connects again and creates a new socket.
From the client to the web application it should be pretty straightforward, since the client uses the socket to send the data to the Node.JS app, which in turn sends it to the queue so it can ultimately be forwarded to the web application. From this direction, if the socket is down, there is no internet connection, and hence the data is not sent in the first place, or is cached on the client.
My concern lies with the other direction, and I would like an answer before I design it this way and actually implement it, so I can avoid hitting any brick walls. Let's say that the web application tries to send an instruction to the client. If the socket is available, the web app forwards the instruction to the queue, which in turn forwards it to the Node.JS app, which in turn uses the socket to forward it to the client. So far so good. If on the other hand, the internet connection from the client has dropped, and hence the socket is currently down, the web app will still send the instruction to the queue. My question is, when the queue forwards the instruction to Node.JS, and Node.JS figures out that the socket does not exist, and hence cannot send the instruction, will the queue receive a reply from Node.JS that it could not forward the data, and hence that it should remain in the queue? If that is the case, it would be perfect. When the client manages to connect to the internet, it will perform a handshake once again, the queue will once again try to send to Node.JS, only this time Node.JS manages to send the instruction to the client.
Is this the correct reasoning of how those components would interact together?
this won't work the way you want it to.
when the node process receives the message from rabbitmq and sees the socket is gone, you can easily nack the message back to the queue.
however, that message will be processed again immediately. it won't sit there doing nothing. the node process will just pick it up again. you'll end up with your node / rabbitmq thrashing as it just nacks a message over and over and over and over, waiting for the socket to come back online.
if you have dozens or hundreds of messages for a client that isn't connected, you'll have dozens or hundreds of messages thrashing round in circles like this. it will destroy the performance of both your node process and rabbitmq.
my recommendation:
when the node app receives the message from rabbitmq, and the socket is not available to the client, put the message in a database table and mark it as waiting for that client.
when the client re-connects, check the database for any pending messages and forward them all at that point.
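a minimal sketch of that recommendation. the `Map` named `pendingByClient` stands in for the database table, and the three handler functions are hypothetical names; a real version would run a query where the map is read or written, and ack the rabbitmq message once the handler returns.

```javascript
// Map used as a stand-in for the database table of pending messages (illustration only).
const pendingByClient = new Map(); // clientId -> array of messages waiting for that client
const liveSockets = new Map();     // clientId -> function that writes to the client's socket

// Called for each message consumed from the queue. Either outcome counts as
// "handled", so the caller can ack the message instead of nacking it.
function handleQueueMessage(clientId, message) {
  const send = liveSockets.get(clientId);
  if (send) {
    send(message);
    return 'delivered';
  }
  if (!pendingByClient.has(clientId)) pendingByClient.set(clientId, []);
  pendingByClient.get(clientId).push(message);
  return 'stored';
}

// Called when a client (re)connects: register the socket, then flush the backlog in order.
function handleClientConnected(clientId, send) {
  liveSockets.set(clientId, send);
  const pending = pendingByClient.get(clientId) || [];
  pendingByClient.delete(clientId);
  pending.forEach((message) => send(message));
}

// Called when the socket drops, so later messages are stored rather than sent.
function handleClientDisconnected(clientId) {
  liveSockets.delete(clientId);
}

// Usage: two instructions arrive while the client is offline, one after it reconnects.
const received = [];
const first = handleQueueMessage('client-7', 'instruction A');
const second = handleQueueMessage('client-7', 'instruction B');
handleClientConnected('client-7', (message) => received.push(message));
const third = handleQueueMessage('client-7', 'instruction C');
console.log(first, second, third); // stored stored delivered
```

because the message leaves the queue as soon as it is stored, nothing is requeued and there is no nack loop.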

NodeJS Synchronize clients

I'm using socket.io and nodejs,
I have a server that I use as my nodeJS server. What I'm trying to do is move clients according to messages sent as client -> server -> clients.
For example, client1 sends the message "MOVE-RIGHT" to the server. The server forwards this message to all clients as "MOVE-RIGHT-CLIENT1", and on receiving it, all clients start moving client1 to the right.
The problem is that all clients may have different latency depending on their network status. For example, if server->client1 communication happens in 50 ms, server->client2 communication may happen in 250 ms. Therefore, client1 does this job nearly 200 ms earlier. So we can say that these two movements are not synchronized, because one of them happens earlier than the other.
As you know, latency between the clients and the server may differ for each client, and it can also differ from message to message for the same client.
My question is: which method should I use to synchronize these clients so that they do their jobs at the same time? Is there any feature of socket.io or nodejs for this? What would you recommend?

Synchronisation: Client, Server Chat

I am writing a client-server chat. The server is the central component and handles all incoming and outgoing messages. The clients are the chat users. They see the chat in a frame and can also write chat messages. These messages are sent over to the server. The server in turn updates all clients.
My problem is synchronisation of the clients. Since the server is multi-threaded, messages can be received from clients while updates (in the form of messages) have to be sent out as well. Since each client is updated in its own thread, there is no guarantee that all clients will receive the same messages. We have a synchronisation problem.
How do I solve it?
I have messed with timestamps and a buffer. But this is not a good solution again because there is no guarantee that after assigning a timestamp the message will be put into the buffer immediately afterwards.
I shall add that I do not know the clients. That is, I only have one open connection in each thread on the server. I do not have an array of clients or something like that to keep track of all the clients.
I suggest that you implement a queue for each client proxy (that's the object that manages the communication with each client).
Each iteration of your server object's (on its own thread) work:
1. It reads messages from the queues of all client proxies first
2. Decides if it needs to send out any messages based on its internal logic and incoming messages
3. Prepares and puts any outgoing messages to the queues of all its client proxies.
The client proxy thread work schedule is this:
1. Read from the communication channel.
2. Write to the queue from client proxy to server (if received any messages).
3. Read from the queue from server to client proxy.
4. Write to communication channel to client (if needed).
You may have to have a mutex on each queue.
Hope that helps
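The schedule above could be sketched roughly like this. It is written in JavaScript only to keep the example short, and `ClientProxy`, `serverIteration` and the queue names are hypothetical. JavaScript runs this on a single thread, so no mutex appears here; in a multi-threaded server each queue access would need to be guarded as noted above.

```javascript
// Each client proxy owns two FIFO queues: one toward the server, one toward the client.
class ClientProxy {
  constructor(name) {
    this.name = name;
    this.toServer = [];  // messages read from the client's connection
    this.toClient = [];  // messages the server wants written to the client
    this.delivered = []; // stands in for the actual network write
  }

  // Proxy steps 1-2: a message arrived from the client, queue it for the server.
  receiveFromClient(text) {
    this.toServer.push({ from: this.name, text });
  }

  // Proxy steps 3-4: drain the queue from the server and write to the client.
  flushToClient() {
    while (this.toClient.length > 0) this.delivered.push(this.toClient.shift());
  }
}

// One server iteration: read every proxy's incoming queue, then queue the same
// batch, in the same order, for every proxy.
function serverIteration(proxies) {
  const batch = [];
  for (const proxy of proxies) {
    while (proxy.toServer.length > 0) batch.push(proxy.toServer.shift());
  }
  for (const proxy of proxies) {
    for (const message of batch) proxy.toClient.push(message);
  }
}

// Usage: both clients end up with the same messages in the same order.
const ann = new ClientProxy('ann');
const ben = new ClientProxy('ben');
const proxies = [ann, ben];
ann.receiveFromClient('hi');
ben.receiveFromClient('hello');
serverIteration(proxies);
proxies.forEach((proxy) => proxy.flushToClient());
console.log(ann.delivered.map((message) => message.text)); // [ 'hi', 'hello' ]
```

Because only the server decides the order of the batch, every client sees the same sequence regardless of which proxy is flushed first.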

How does socket.io send messages across multiple servers?

The Socket.io API has the ability to send messages to all clients.
With one server and all sockets in memory, I understand how that one server can send a message to all its clients; that's pretty obvious. But what about multiple servers using Redis to store the sockets?
If I have client a connected to server y and client b connected to server z (and a Redis box for the store) and I do socket.broadcast.emit on one server, the client on the other server will receive this message. How?
How do the clients that are actually connected to the other server get that message?
Is one server telling the other server to send a message to its connected client?
Is the server establishing its own connection to the client to send that message?
Socket.io uses MemoryStore by default, so all the connected clients are stored in memory, making it impossible (well, not quite, but more on that later) to send events to and receive events from clients connected to a different socket.io server.
One way to make all the socket.io servers receive all the events is for every server to use Redis pub/sub. So, instead of using socket.emit, one can publish to Redis.
var redis_client = require('redis').createClient();
// Publish to the shared channel instead of calling socket.emit directly.
redis_client.publish('channelName', data);
And all the socket servers subscribe to that channel through redis and upon receiving a message emit it to clients connected to them.
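var redis_sub = require('redis').createClient();
redis_sub.subscribe('channelName', 'moreChannels');
// Callback-style node_redis API: fires for every message on a subscribed channel.
redis_sub.on("message", function (channel, message) {
    // Assumes io is this server's socket.io instance; emit to its locally connected clients.
    io.sockets.emit(channel, message);
});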
Complicated stuff! But wait, it turns out you don't actually need this sort of code to achieve the goal. Socket.io has RedisStore, which essentially does what the code above is supposed to do in a nicer way, so that you can write Socket.io code as you would for a single server and events will still be propagated to the other socket.io servers through Redis.
To summarise, socket.io sends messages across multiple servers by using Redis as the channel instead of memory.
There are a few ways you can do this. More info in this question. A good explanation of how pub/sub works in Redis is in the Redis docs, and an explanation of how the paradigm works in general is on Wikipedia.
Quoting the Redis docs:
SUBSCRIBE, UNSUBSCRIBE and PUBLISH implement the Publish/Subscribe
messaging paradigm where (citing Wikipedia) senders (publishers) are
not programmed to send their messages to specific receivers
(subscribers). Rather, published messages are characterized into
channels, without knowledge of what (if any) subscribers there may be.
Subscribers express interest in one or more channels, and only receive
messages that are of interest, without knowledge of what (if any)
publishers there are. This decoupling of publishers and subscribers
can allow for greater scalability and a more dynamic network topology.
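To illustrate the decoupling described in that quote, here is a tiny in-memory publish/subscribe sketch. It is not Redis and not socket.io code: `PubSub` is a hypothetical class for the illustration. Its `publish` returns the number of subscribers reached, in the same spirit as Redis's PUBLISH reply.

```javascript
// Minimal in-memory publish/subscribe channel registry (illustration only).
class PubSub {
  constructor() {
    this.channels = new Map(); // channel name -> Set of subscriber callbacks
  }

  subscribe(channel, handler) {
    if (!this.channels.has(channel)) this.channels.set(channel, new Set());
    this.channels.get(channel).add(handler);
  }

  unsubscribe(channel, handler) {
    const handlers = this.channels.get(channel);
    if (handlers) handlers.delete(handler);
  }

  // The publisher names only a channel, never a receiver.
  publish(channel, message) {
    const handlers = this.channels.get(channel) || new Set();
    handlers.forEach((handler) => handler(channel, message));
    return handlers.size;
  }
}

// Usage: servers y and z both subscribe; whoever publishes does not know about either.
const bus = new PubSub();
const seenByY = [];
const seenByZ = [];
const onY = (channel, message) => seenByY.push(message);
bus.subscribe('chat', onY);
bus.subscribe('chat', (channel, message) => seenByZ.push(message));

const reached = bus.publish('chat', 'hello everyone');         // 2 subscribers
const reachedNobody = bus.publish('empty-channel', 'anyone?'); // 0 subscribers

bus.unsubscribe('chat', onY); // server y goes away; the publisher is unaffected
bus.publish('chat', 'second message');
console.log(seenByY.length, seenByZ.length); // 1 2
```

In the multi-server case, each socket.io server plays the role of a subscriber and re-emits what it receives to its own connected clients.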

Resources