I am going to design a system where there is a two-way communication between clients and a web application. The web application can receive data from the client so it can persist it to a DB and so forth, while it can also send instructions to the client. For this reason, I am going to use Node.JS and Socket.IO.
I also need RabbitMQ: if the web application sends an instruction to a client that is down (so the socket has dropped), I want the instruction to be queued so it can be sent once the client connects again and creates a new socket.
From the client to the web application it should be pretty straightforward, since the client uses the socket to send the data to the Node.JS app, which in turn sends it to the queue so it can ultimately be forwarded to the web application. From this direction, if the socket is down, there is no internet connection, and hence the data is not sent in the first place, or is cached on the client.
My concern lies with the other direction, and I would like an answer before I design it this way and actually implement it, so I can avoid hitting any brick walls. Let's say that the web application tries to send an instruction to the client. If the socket is available, the web app forwards the instruction to the queue, which in turn forwards it to the Node.JS app, which in turn uses the socket to forward it to the client. So far so good. If on the other hand, the internet connection from the client has dropped, and hence the socket is currently down, the web app will still send the instruction to the queue. My question is, when the queue forwards the instruction to Node.JS, and Node.JS figures out that the socket does not exist, and hence cannot send the instruction, will the queue receive a reply from Node.JS that it could not forward the data, and hence that it should remain in the queue? If that is the case, it would be perfect. When the client manages to connect to the internet, it will perform a handshake once again, the queue will once again try to send to Node.JS, only this time Node.JS manages to send the instruction to the client.
Is this the correct reasoning of how those components would interact together?
this won't work the way you want it to.
when the node process receives the message from rabbitmq and sees the socket is gone, you can easily nack the message back to the queue.
however, that message will be processed again immediately. it won't sit there doing nothing. the node process will just pick it up again. you'll end up with your node / rabbitmq thrashing as it just nacks a message over and over and over and over, waiting for the socket to come back online.
if you have dozens or hundreds of messages for a client that isn't connected, you'll have dozens or hundreds of messages thrashing round in circles like this. it will destroy the performance of both your node process and rabbitmq.
my recommendation:
when the node app receives the message from rabbitmq, and the socket is not available to the client, put the message in a database table and mark it as waiting for that client.
when the client re-connects, check the database for any pending messages and forward them all at that point.
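A minimal sketch of that recommendation, under some assumptions: the `pending` Map stands in for the database table, and `sockets`, `deliverOrStore`, `onClientConnected` and the `'instruction'` event name are hypothetical names chosen for illustration. In a real consumer you would ack the rabbitmq message once it has been either sent or stored, so it does not get redelivered.

```javascript
// In-memory stand-in for the "pending messages" table (assumption: a real
// deployment would use a database table keyed by client id).
const pending = new Map(); // clientId -> array of waiting messages
const sockets = new Map(); // clientId -> currently connected socket

// Called when a message for `clientId` arrives from the queue.
function deliverOrStore(clientId, message) {
  const socket = sockets.get(clientId);
  if (socket) {
    socket.emit('instruction', message);
    return 'sent';
  }
  // Socket is gone: park the message instead of nacking it back to the queue.
  if (!pending.has(clientId)) pending.set(clientId, []);
  pending.get(clientId).push(message);
  return 'stored';
}

// Called when the client (re)connects: flush everything that was waiting.
function onClientConnected(clientId, socket) {
  sockets.set(clientId, socket);
  const waiting = pending.get(clientId) || [];
  pending.delete(clientId);
  for (const message of waiting) socket.emit('instruction', message);
}

// Called when the socket drops.
function onClientDisconnected(clientId) {
  sockets.delete(clientId);
}
```

The stored messages are forwarded in the order they arrived, and nothing loops between node and rabbitmq while the client is offline.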
Related
I am looking for a solution to my problem. I have a Node.js server serving my web application, where users can log in. I want to handle a situation where user A performs a specific action and user B, who is associated with this action, gets a real-time notification. Is there a module that would help me, or is there some other solution?
What you are describing is "server push" where the server proactively notifies a user on their site of some activity or event. In the web browser world these days, there are basically two underlying technology options:
webSocket (or some use socket.io, a more feature rich library built on top of webSocket)
server sent events (SSE).
For webSocket or socket.io, the basic idea is that the web page connects back to the server with a webSocket or socket.io connection. That connection stays live (unlike a typical http connection that would connect, send a request, receive a response, then close the connection). So, with that live connection, the server is free to send the client (which is the web page in a user's browser), notifications at any time. The Javascript in the web page then listens for incoming data on the connection and, based on what data it receives, then uses Javascript to update the currently displayed web page to show something to the user.
For server sent events, you open an event source on the client-side and that also creates a lasting connection to the server, but this connection is one-way only (the server can send events to the client) and it's completely built on HTTP. This is a newer technology than webSocket, but is more limited in purpose.
In both of these cases, the server has to keep track of which connection belongs to which user so when something interesting happens on the server, it can know which connection to notify of the event.
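As a rough illustration of that bookkeeping, here is a small sketch. The names `connectionsByUser`, `register`, `unregister` and `notifyUser` are made up for this example, and a "connection" is assumed to be any object with a `send(string)` method (a webSocket connection has one, other transports may differ).

```javascript
// userId -> Set of live connections (a user may have several tabs open).
const connectionsByUser = new Map();

// Call when a connection has been authenticated as belonging to `userId`.
function register(userId, connection) {
  if (!connectionsByUser.has(userId)) connectionsByUser.set(userId, new Set());
  connectionsByUser.get(userId).add(connection);
}

// Call when the connection closes.
function unregister(userId, connection) {
  const connections = connectionsByUser.get(userId);
  if (!connections) return;
  connections.delete(connection);
  if (connections.size === 0) connectionsByUser.delete(userId);
}

// Push a notification to every live connection of one user.
// Returns how many connections were notified (0 means the user is offline).
function notifyUser(userId, payload) {
  const connections = connectionsByUser.get(userId) || new Set();
  for (const connection of connections) {
    connection.send(JSON.stringify(payload));
  }
  return connections.size;
}
```

When user A performs the action, the server would call something like `notifyUser(userB.id, { type: 'action', from: userA.id })`.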
Another solution occasionally used is client-side polling. In this case, the web page just regularly sends an ajax call to the server asking if there are any new events. Anything new yet? Anything new yet? Anything new yet? While this is conceptually a bit simpler, it's typically far less efficient unless the polling intervals are spaced far apart, say 10 or 15 minutes which limits the timeliness of any notifications. This is because most polling requests (particularly when done rapidly) return no data and are just wasted cycles on your server.
If you want to notify userB only when both of you are online at the time of the action, use websockets: they give you a two-way channel over which to pass the message to userB.
If you want to notify them whenever, regardless of online status, use a message queue.
I have a node server accepting websocket connections from the clients. Each client can broadcast a message to all of the other clients.
UPDATE: I am using https://github.com/websockets/ws as my library of choice.
At the moment, the server has an array with all of the connections. Each connection has a tabId. When one of the client emits a message, I go through all of the connections and check: if the connection's tabId doesn't match, I send the message to the client.
Because of load, I now need more than one server. So there will be, say, two servers, each with a number of clients.
How do I make sure that a message gets broadcast to all of the websocket clients, and not only the ones connected to the same server?
One possible solution I thought is to have the connections stored on a database, where each record has the tabId and the serverId. However, even a simple broadcast gets tricky as messages to "local" sockets are easy to broadcast (the socket is local and available) whereas messages to "remote" sockets are tricky, and would imply intra-server communication.
Is there a good pattern to solve this? Surely, this is something that people face every day.
You could use a message queue like RabbitMQ.
When a client logs in to your server, create a consumer that listens to a queue which receives the messages directed to that particular client. When a client sends a message, just use a publisher to publish it to each recipient's queue.
This way you don't need to know which node a client is on, and it doesn't matter if a client jumps from one node to another.
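To show the shape of the idea without a running broker, here is a sketch with an in-memory stand-in. `InMemoryBroker`, `onLogin`, `onLogout`, `sendTo` and the `client.<id>` queue naming are all assumptions made for this example and are not RabbitMQ's API. A real broker would also hold messages while no consumer is attached, which this stand-in does not.

```javascript
// Hypothetical stand-in for a message broker: one handler per queue name.
class InMemoryBroker {
  constructor() {
    this.consumers = new Map(); // queueName -> handler
  }
  consume(queueName, handler) {
    this.consumers.set(queueName, handler);
  }
  cancel(queueName) {
    this.consumers.delete(queueName);
  }
  // Returns true if a consumer handled the message, false otherwise.
  publish(queueName, message) {
    const handler = this.consumers.get(queueName);
    if (!handler) return false;
    handler(message);
    return true;
  }
}

const broker = new InMemoryBroker();
const queueFor = (clientId) => `client.${clientId}`;

// On login, the node that holds the socket starts consuming the client's queue.
function onLogin(clientId, socket) {
  broker.consume(queueFor(clientId), (message) => socket.send(message));
}

// On logout or disconnect, stop consuming.
function onLogout(clientId) {
  broker.cancel(queueFor(clientId));
}

// Any node can publish to a recipient without knowing where it is connected.
function sendTo(clientId, message) {
  return broker.publish(queueFor(clientId), message);
}
```

For a broadcast, the sender would publish to every recipient's queue, and each node only forwards what arrives on the queues of its own clients.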
On the server side for websockets there is already a ping/pong implementation, where the server sends a ping and the client replies with a pong to let the server know whether a client is connected or not. But there isn't anything implemented in reverse to let the client know whether the server is still connected to it.
There are two ways to go about this I have read:
Every client sends a message to the server every x seconds, and whenever an error is thrown when sending, that means the server is down, so reconnect.
Server sends a message to every client every x seconds. The client receives this message and updates a variable, and a thread on the client side checks every x seconds whether this variable has changed. If it hasn't changed in a while, the client hasn't received a message from the server, so you can assume the server is down and re-establish the connection.
Either method lets the client figure out whether the server is still online. With the first one you'll be sending traffic to the server, whereas with the second one you'll be sending traffic out of the server. Both seem easy enough to implement, but I'm not sure which is the better way in terms of being more efficient/cost-effective.
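For reference, the second method could look roughly like this on the client. `createWatchdog`, `messageReceived` and `check` are names invented for this sketch, and the clock is passed in as a parameter only so the logic can be exercised without waiting.

```javascript
// Client-side watchdog: remembers when the last server message arrived and
// reports a timeout if nothing has been seen for longer than `timeoutMs`.
function createWatchdog(timeoutMs, onTimeout, now = Date.now) {
  let lastSeen = now();
  return {
    // Call from the socket's message handler.
    messageReceived() {
      lastSeen = now();
    },
    // Call periodically; returns true if the timeout fired on this check.
    check() {
      if (now() - lastSeen > timeoutMs) {
        onTimeout();
        lastSeen = now(); // restart the window so reconnects are not spammed
        return true;
      }
      return false;
    },
  };
}

// Typical wiring (not executed here; `reconnect` is a hypothetical function):
//   const watchdog = createWatchdog(15000, reconnect);
//   socket.onmessage = () => watchdog.messageReceived();
//   setInterval(() => watchdog.check(), 5000);
```

The timeout should be a few multiples of the server's send interval so that a single delayed message does not trigger a reconnect.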
Server upload speeds are higher than client upload speeds, but server CPUs are an expensive resource while client CPUs are relatively cheap. Unloading logic onto the client is a more cost-effective approach...
Having said that, servers must implement this specific logic (actually, all ping/timeout logic), otherwise they might be left with "half-open" sockets that drain resources but aren't connected to any client.
Remember that sockets (file descriptors) are a limited resource. Not only do they use memory even when no traffic is present, but they prevent new clients from connecting when the resource is maxed out.
Hence, servers must clear out dead sockets, either using timeouts or by implementing ping.
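A sketch of what clearing out dead sockets with ping can look like on the server. The `sweep` function and the `isAlive` flag are assumptions for this example, and each client is assumed to expose `ping()` and `terminate()` methods and to set `isAlive = true` from its pong handler. Check your websocket library for the actual names.

```javascript
// Run once per ping interval over the set of connected clients.
// A client that has not answered the previous ping is considered dead.
function sweep(clients) {
  for (const client of clients) {
    if (!client.isAlive) {
      // No pong since the last sweep: free the socket.
      client.terminate();
      clients.delete(client);
      continue;
    }
    // Mark as unconfirmed and ping; the pong handler flips the flag back.
    client.isAlive = false;
    client.ping();
  }
}

// Typical wiring (not executed here; `clients` is a Set of connections):
//   on connection: client.isAlive = true; clients.add(client);
//   on pong:       client.isAlive = true;
//   setInterval(() => sweep(clients), 30000);
```

A half-open socket is terminated within two intervals at most, and the interval can be tuned to stay under any timeout imposed by the hosting environment.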
P.S.
I'm not a node.js expert, but this type of logic should be implemented using the Websocket protocol ping rather than by your application. You should probably look into the node.js server / websocket framework and check how to enable ping-ing.
You should set pings to accommodate your specific environment, i.e., if you host on Heroku, then Heroku will implement a timeout of ~55 seconds and your pings should be sent before this timeout occurs.
I am writing a client/server-based chat. The server is the central component and handles all the incoming and outgoing messages. The clients are the chat users. They see the chat in a frame and can also write chat messages. These messages are sent over to the server. The server in turn updates all clients.
My problem is synchronisation of the clients. Since the server is multi-threaded, messages can be received from clients while updates (in the form of messages) have to be sent out as well. Since each client is updated in its own thread, there is no guarantee that all clients will receive the same messages. We have a synchronisation problem.
How do I solve it?
I have messed with timestamps and a buffer. But this is not a good solution again because there is no guarantee that after assigning a timestamp the message will be put into the buffer immediately afterwards.
I shall add that I do not know the clients. That is, I only have one open connection in each thread on the server. I do not have an array of clients or something like that to keep track of all the clients.
I suggest that you implement a queue for each client proxy (that's the object that manages the communication with each client).
On each iteration of its work loop, your server object (on its own thread):
1. It reads messages from the queues of all client proxies first
2. Decides if it needs to send out any messages based on its internal logic and incoming messages
3. Prepares and puts any outgoing messages to the queues of all its client proxies.
The client proxy thread work schedule is this:
1. Read from the communication channel.
2. Write to the queue from client proxy to server (if received any messages).
3. Read from the queue from server to client proxy.
4. Write to communication channel to client (if needed).
You may have to have a mutex on each queue.
Hope that helps
For example, we have basic node.js server <-> client communication.
A basic node.js server sends a message every 500 ms to the one client, or to every client, connected on its own socket. The client responds correctly to the heartbeat and receives all the messages in time. But imagine the client has a temporary connection lag (without the socket closing), CPU overload, etc., and cannot process anything for 2 seconds or more.
In this situation, where do all the messages go that have not yet been received by the client?
Are they stored in node? In a buffer or something similar?
And vice versa? The client sends a message to the server every 500 ms (the server only listens without responding), but the server has a temporary connection issue or CPU overload for 2 or 3 seconds.
Thanks in advance! Any information or clarification is welcome.
Javier
Yes, they are stored in buffers, primarily in buffers provided by the OS kernel. Same thing on the receiving end for connections incoming to a node server.