How to send Websocket messages with the least amount of latency? - node.js

I'm working on a Websocket server programmed in Node.js and I'm planning to send out a multicast message to upwards of 20,000 users. The message will be sent out on a regular interval of a second, however I am worried about the performance from the Node.js server.
I understand that Node.js works asynchronously and creates and destroys threads as it requires, but I am unsure of its efficiency. Ideally I would like to send out the message with an average latency of 5ms.
Currently I'm sending messages to all users by looping over all the connected clients, as follows:
function StartBroadcastMessage()
{
    console.log("broadcasting socket data to client...!");
    for(var i=0;i < clientsWithEvents.length;i++){ //Runs through all the clients
        var client = clientsWithEvents[i]; // declared with var so it is not an implicit global
        if(client.eventid.toLowerCase() == serverEventName.toLowerCase()) //Checks to see if the Client Event names and server names are the same
            client.connection.sendUTF(GetEventSocketFeed(client.eventid)); //Sends out event data to that particular client
    }
    timeoutId = setTimeout(StartBroadcastMessage,2*1000);
}
Is this an efficient way of sending out a multicast message with a low latency, or is there a better way?
Also, is there an efficient way to perform a load test on the server simulating a number of devices connected to the Websocket server? (So far I have found this Node app https://github.com/qarea/websockets-stress-test)

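You can use socket.io to broadcast the message.
var io = require('socket.io').listen(80);
io.sockets.on('connection', function (socket) {
    socket.broadcast.emit('user connected');
});
This avoids writing your own loop over the socket objects: you hand the message to socket.io once and it takes care of delivering it to each client. Note that socket.broadcast.emit() sends to every client except the socket it is called on; to reach every connected client from the server, use io.sockets.emit().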
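If you stay with the plain WebSocket loop from the question, one cheap improvement is to build the payload once per broadcast instead of once per client. The following is only a sketch under assumptions: `buildFeed` is a hypothetical stand-in for GetEventSocketFeed, and `makeFakeClient` exists purely so the function can be exercised without a real server.

```javascript
// Sketch only: `eventid` and `connection.sendUTF` mirror the question's
// client objects; `buildFeed` is a hypothetical stand-in for
// GetEventSocketFeed.
function broadcastToEvent(clients, eventName, buildFeed) {
  const wanted = eventName.toLowerCase();
  let payload = null; // built lazily, at most once per broadcast
  let sent = 0;
  for (const client of clients) {
    if (client.eventid.toLowerCase() !== wanted) continue;
    if (payload === null) payload = buildFeed(eventName);
    client.connection.sendUTF(payload);
    sent++;
  }
  return sent; // number of clients the payload was handed to
}

// Fake client used only to try the function out without a real server.
function makeFakeClient(eventid) {
  const received = [];
  return {
    eventid,
    received,
    connection: { sendUTF: (text) => received.push(text) },
  };
}
```

With 20,000 clients subscribed to the same event, this turns 20,000 feed computations per tick into one; the per-client cost that remains is the socket write itself.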

Related

How to prevent socket io client (browser) from being overflowed with huge payload coming from server?

I have a React.js client app and a Node.js server app and the Node.js app receives json data in real time via socket.io from another microservice. The JSON data is sent very often and this breaks the client app. For example:
I stop the server but the client still receives data
If I try refreshing the browser, it takes a lot of time to refresh
It also used to disconnect and reconnect the sockets (I fixed this by increasing the pingTimeout but that did not solve the other problems)
I also used maxHttpBufferSize and updateTimeout by increasing them but that does not really help. Decreasing the maxHttpBufferSize stops the messages from being received but I want them to be received just in a manner which does not break my client application.
Any advices on what I can do to improve my situation?
EDIT:
It could also work if I do not send all messages but skip every second or so but I am not sure how to achieve this?
Backpressure can be implemented with acknowledgements:
the client notifies the server when it has successfully handled the packet
socket.on("my-event", (data, cb) => {
// do something with the data
// then call the callback
cb();
});
the server must wait for the acknowledgement before sending more packets
io.on("connection", (socket) => {
socket.emit("my-event", data, () => {
// the client has acknowledged the packet, we can continue
});
})
Reference: https://socket.io/docs/v4/emitting-events/#acknowledgements
Note: using volatile packets won't work here, because the amount of data buffered on the server is not taken into account
Reference: https://github.com/websockets/ws/blob/master/doc/ws.md#websocketbufferedamount
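The acknowledgement idea can be combined with the "skip some messages" wish from the question: keep at most one packet in flight, and while waiting for the acknowledgement remember only the newest data. This is a minimal sketch, not a Socket.IO feature. `emitWithAck` is a hypothetical wrapper you would write around `socket.emit("my-event", data, callback)`.

```javascript
// Sketch: at most one unacknowledged packet per client; intermediate
// updates that arrive while waiting are overwritten by newer ones.
// `emitWithAck(data, onAck)` is a hypothetical wrapper around
// socket.emit("my-event", data, onAck).
function createLatestOnlySender(emitWithAck) {
  let inFlight = false;  // true while waiting for the client's ack
  let hasLatest = false; // true when there is unsent data waiting
  let latest;

  function flush() {
    if (inFlight || !hasLatest) return;
    const data = latest;
    latest = undefined;
    hasLatest = false;
    inFlight = true;
    emitWithAck(data, () => {
      inFlight = false;
      flush(); // send whatever arrived in the meantime, if anything
    });
  }

  // Call this every time new data arrives from the microservice.
  return function offer(data) {
    latest = data;
    hasLatest = true;
    flush();
  };
}
```

A slow browser then receives fewer, fresher updates instead of a growing backlog, at the price of never seeing the values that were overwritten.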

I can't get my head around websockets (via socket.io and node.js)

I'm new to websockets/socket.io/node.js. I'm trying to write a card game app, but pretty much all the example tutorials I've found are creating chat applications. So I'm struggling to get my head around the concepts and how they can be applied to my card game.
Keeping it simple, the card game will involve two players. The game involves moving cards around the table. Each player has to see the other player's moves as they happen (hence the need for constant connections). But each opponent's cards are concealed from the other.
So two people browse to the same table then click to sit (and play, when both seats are taken). Using
io.on("connection", function(sock){
//socket events in here
});
am I creating the one socket ('io', or 'sock'?) that both clients and the server share, or is that two separate sockets (server/clientA and server/clientB)? I ask, because I'm struggling to understand what's happening when a message is emitted and/or broadcast. If a client emits a message, is that message sent to both the server and the other client, or just the server? And then, further, does it also send the message to itself as well? It seems as though that's the logic... or what is the purpose of the 'broadcast' method?
From a functional perspective, I need the server to send different messages to each player. So it's not like a chatroom where the server sends the chat to everyone. But if it's one socket that the three of us share (clients and server), how do I manage who sees what? I've read about namespaces, but I'm struggling to work out how that could be used. And if it's two separate sockets, then I can more easily imagine sending different data to the separate clients. But how is that implemented - is that two 'io' objects, or two 'sock' objects?
Finally, I've got no idea if this is the sort of long-winded question that is accepted here, so if it's not, can someone direct me to a forum that discussions can occur? Cheers!
(in case it matters I'm also using Expressjs as the server).
Edit to add:
Part of my confusion is regarding the difference between 'io' and 'sock'. This code eg from the socket.io page is a good example of methods being applied to either of them:
io.on('connection', function(socket){
socket.emit('request', /* */); // emit an event to the socket
io.emit('broadcast', /* */); // emit an event to all connected sockets
socket.on('reply', function(){ /* */ }); // listen to the event
});
WebSocket server side listens for incoming socket connections from clients.
Each client upon connection opens its own socket between him and server. The server is the one that keeps track of all clients.
So once client emits the message server is listening for, the server can do with that message whatever. The message itself can contain information about who is the recipient of that message.
The server can pass the message to everyone or broadcast it to specific user or users based on information your client has sent you or some other logic.
For a card game:
The server listens for incoming connections. Once two clients are connected both of them should emit game ID in which they want to participate. The server can join their sockets in one game(Room) and all of the communication between those two clients can continue in that room. Each time one of the clients passes data to the server, that data should contain info about the recipient.
Here is one simple example that could maybe get you going:
Client side
// set-up a connection between the client and the server
var socket = io.connect();
// get some game identifier
var game = "thebestgameever";
socket.on('connect', function() {
// Let server know which game you want to play
socket.emit('game', game);
});
function makeAMove(move)
{
socket.emit('madeAMove', {move:move, game:game});
}
socket.on('move', function(data) {
console.log('Player made a move', data);
});
Server side
var socketio = require('socket.io');
var io = socketio.listen(server); // `server` is your existing HTTP/Express server
//listen for new connections from clients
io.sockets.on('connection', function(socket) {
// if client joined game get his socket assigned to the game
socket.on('game', function(game) {
socket.join(game);
});
socket.on('madeAMove', function(data){
let game = data.game;
let move = data.move;
io.sockets.in(game).emit('move', move);
});
})
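The example above sends the same move to everyone in the room. For the part of the question about concealed cards, one possible approach is to keep the full game state on the server only and derive a separate view for each player, then emit each view to that player's own socket. The sketch below shows just the view function; the shape of `game` (a `table` array and a `hands` object keyed by player id) is an assumption made for illustration.

```javascript
// Sketch: derive what one player is allowed to see from the full state.
// The `game` shape ({ table, hands }) is hypothetical.
function viewFor(game, playerId) {
  const view = { table: game.table.slice(), hands: {} };
  for (const [id, cards] of Object.entries(game.hands)) {
    // Own hand: the actual cards. Opponent's hand: only how many cards.
    view.hands[id] = id === playerId ? cards.slice() : { count: cards.length };
  }
  return view;
}
```

On the server you would call something like `socketOfAlice.emit('state', viewFor(game, 'alice'))` and the same for the other player, so neither client ever receives the opponent's cards.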

why is performance of redis+socket.io better than just socket.io?

I earlier had all my code in a socket.io+node.js server. I recently converted all the code to redis+socket.io+node.js after noticing slow performance when too many users send messages across the server.
As I understand it, socket.io alone was slow because it is not multi-threaded, so it handles one request or emit at a time.
What redis does is distribute these requests or emits across channels. Clients subscribe to different channels, and when a message is published on a channel, all the clients subscribed to it receive the message. It does this via this piece of code:
sub.on("message", function (channel, message) {
client.emit("message",message);
});
The client.on("message", function(){}) handler takes it from here and publishes messages to the different channels.
Here is a brief code explaining what i am doing with redis:
io.sockets.on('connection', function (client) {
var pub = redis.createClient();
var sub = redis.createClient();
sub.on("message", function (channel, message) {
client.emit('message',message);
});
client.on("message", function (msg) {
if(msg.type == "chat"){
pub.publish("channel." + msg.tousername,msg.message);
pub.publish("channel." + msg.user,msg.message);
}
else if(msg.type == "setUsername"){
sub.subscribe("channel." +msg.user);
}
});
});
As redis stores the channel information, we can have different servers publish to the same channel.
So, what i dont understand is, if sub.on("message") is getting called every time a request or emit is sent, why is redis supposed to be giving better performance? I suppose even the sub.on("message") method is not multi threaded.
As you might know, Redis allows you to scale across multiple node instances, so the performance gain only comes once you do that. Using Pub/Sub is not faster in itself. It's technically slower, because every published message has to make a round trip through Redis. The "giving better performance" is only really true when you start to scale out horizontally.
For example, you have one node instance (simple chat room) -- that can handle a maximum of 200 active users. You are not using Redis yet because there is no need. Now, what if you want to have 400 active users? Whilst using your example above, you can now achieve this 400 user mark, which is a "performance increase". In the sense you can now handle more users, but not really a speed increase. If that makes sense. Hope this helps!
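To make the scaling argument concrete, here is a small in-process simulation. It is not Redis: a Node.js EventEmitter stands in for the Redis channel, and `createInstance` is a hypothetical model of one node process with its own locally connected users. The point it illustrates is that a message published on one instance reaches a user connected to another.

```javascript
const { EventEmitter } = require("events");

// Sketch: `bus` stands in for Redis Pub/Sub; each instance only knows
// about the users connected to it.
function createInstance(bus) {
  const localInboxes = new Map(); // username -> messages delivered locally
  bus.on("message", (channel, message) => {
    const user = channel.replace(/^channel\./, "");
    const inbox = localInboxes.get(user);
    if (inbox) inbox.push(message); // ignore users connected elsewhere
  });
  return {
    connect(user) {
      const inbox = [];
      localInboxes.set(user, inbox);
      return inbox;
    },
    publish(toUser, message) {
      bus.emit("message", "channel." + toUser, message);
    },
  };
}

const bus = new EventEmitter();
const instanceA = createInstance(bus);
const instanceB = createInstance(bus);
const aliceInbox = instanceA.connect("alice");
const bobInbox = instanceB.connect("bob");
instanceA.publish("bob", "hi bob"); // delivered through instance B
```

Each instance still does its own single-threaded work; the gain is that the connected users are spread over several processes.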

Websocket transport reliability (Socket.io data loss during reconnection)

Used
NodeJS, Socket.io
Problem
Imagine there are 2 users U1 & U2, connected to an app via Socket.io. The algorithm is the following:
1. U1 completely loses Internet connection (ex. switches Internet off)
2. U2 sends a message to U1.
3. U1 does not receive the message yet, because the Internet is down
4. Server detects U1 disconnection by heartbeat timeout
5. U1 reconnects to socket.io
6. U1 never receives the message from U2 - it is lost on Step 4 I guess.
Possible explanation
I think I understand why it happens:
on Step 4 Server kills socket instance and the queue of messages to U1 as well
Moreover on Step 5 U1 and Server create new connection (it is not reused), so even if message is still queued, the previous connection is lost anyway.
Need help
How can I prevent this kind of data loss? I have to use heartbeats, because I do not want people to hang in the app forever. Also I must still allow reconnection, because when I deploy a new version of the app I want zero downtime.
P.S. The thing I call "message" is not just a text message I can store in database, but valuable system message, which delivery must be guaranteed, or UI screws up.
Thanks!
Addition 1
I do already have a user account system. Moreover, my application is already complex. Adding offline/online statuses won't help, because I already have this kind of stuff. The problem is different.
Check out step 2. On this step we technically cannot say if U1 goes offline, he just loses connection lets say for 2 seconds, probably because of bad internet. So U2 sends him a message, but U1 doesn't receive it because internet is still down for him (step 3). Step 4 is needed to detect offline users, lets say, the timeout is 60 seconds. Eventually in another 10 seconds internet connection for U1 is up and he reconnects to socket.io. But the message from U2 is lost in space because on server U1 was disconnected by timeout.
That is the problem, I want 100% delivery.
Solution
1. Collect each emit (emit name and data) in a per-user {} object, keyed by a random emitID. Send the emit.
2. Confirm the emit on the client side (send an emit back to the server with the emitID).
3. If confirmed - delete the object identified by emitID from {}.
4. If the user reconnected - check {} for this user and loop through it, executing Step 1 for each object in {}.
5. When disconnected and/or connected, flush {} for the user if necessary.
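// Server
const pendingEmits = {}; // emitID -> { name, data }, kept per user
// resendAllPendingEmits is your own function that re-sends every entry in pendingEmits
socket.on('reconnection', () => resendAllPendingEmits());
socket.on('confirm', (emitID) => { delete pendingEmits[emitID]; });
// Client
// the server is assumed to send the emitID along with the data
socket.on('something', ({ emitID, data }) => {
    // handle data, then confirm receipt
    socket.emit('confirm', emitID);
});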
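The five steps above can be sketched as a small per-user store. This is an illustration under assumptions, not the code that ran in production: `send(name, payload)` is a hypothetical wrapper around `socket.emit` on the user's current socket, and `crypto.randomUUID` is used to generate the random emitID.

```javascript
const { randomUUID } = require("crypto");

// Sketch: one store per user. `send(name, payload)` is a hypothetical
// wrapper around socket.emit on that user's current socket.
function createPendingEmits(send) {
  const pending = new Map(); // emitID -> { name, data }
  return {
    // Step 1: remember the emit, then send it together with its id.
    emit(name, data) {
      const emitID = randomUUID();
      pending.set(emitID, { name, data });
      send(name, { emitID, data });
      return emitID;
    },
    // Steps 2-3: the client confirmed, forget the emit.
    confirm(emitID) {
      return pending.delete(emitID);
    },
    // Step 4: after a reconnect, resend everything still unconfirmed.
    resendAll() {
      for (const [emitID, { name, data }] of pending) {
        send(name, { emitID, data });
      }
    },
    size() {
      return pending.size;
    },
  };
}
```

Because the same emitID is reused on a resend, the client can also use it to discard a message it has already processed.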
Solution 2 (kinda)
Added 1 Feb 2020.
While this is not really a solution for Websockets, someone may still find it handy. We migrated from Websockets to SSE + Ajax. SSE allows you to connect from a client to keep a persistent TCP connection and receive messages from a server in realtime. To send messages from a client to a server - simply use Ajax. There are disadvantages like latency and overhead, but SSE guarantees reliability because it is a TCP connection.
Since we use Express we use this library for SSE https://github.com/dpskvn/express-sse, but you can choose the one that fits you.
SSE is not supported in IE and most Edge versions, so you would need a polyfill: https://github.com/Yaffle/EventSource.
Others have hinted at this in other answers and comments, but the root problem is that Socket.IO is just a delivery mechanism, and you cannot depend on it alone for reliable delivery. The only person who knows for sure that a message has been successfully delivered to the client is the client itself. For this kind of system, I would recommend making the following assertions:
Messages aren't sent directly to clients; instead, they get sent to the server and stored in some kind of data store.
Clients are responsible for asking "what did I miss" when they reconnect, and will query the stored messages in the data store to update their state.
If a message is sent to the server while the recipient client is connected, that message will be sent in real time to the client.
Of course, depending on your application's needs, you can tune pieces of this--for example, you can use, say, a Redis list or sorted set for the messages, and clear them out if you know for a fact a client is up to date.
Here are a couple of examples:
Happy path:
U1 and U2 are both connected to the system.
U2 sends a message to the server that U1 should receive.
The server stores the message in some kind of persistent store, marking it for U1 with some kind of timestamp or sequential ID.
The server sends the message to U1 via Socket.IO.
U1's client confirms (perhaps via a Socket.IO callback) that it received the message.
The server deletes the persisted message from the data store.
Offline path:
U1 loses internet connectivity.
U2 sends a message to the server that U1 should receive.
The server stores the message in some kind of persistent store, marking it for U1 with some kind of timestamp or sequential ID.
The server sends the message to U1 via Socket.IO.
U1's client does not confirm receipt, because they are offline.
Perhaps U2 sends U1 a few more messages; they all get stored in the data store in the same fashion.
When U1 reconnects, it asks the server "The last message I saw was X / I have state X, what did I miss."
The server sends U1 all the messages it missed from the data store based on U1's request
U1's client confirms receipt and the server removes those messages from the data store.
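Both paths can be sketched with a tiny in-memory store that assigns sequential IDs and answers the "what did I miss" question. In a real system the Map below would be a database table or a Redis list; all function names here are hypothetical.

```javascript
// Sketch: an in-memory stand-in for the persistent message store.
function createMessageStore() {
  const byUser = new Map(); // userId -> [{ id, body }] in ascending id order
  let nextId = 1;
  return {
    // Store a message for a recipient before trying to deliver it.
    add(userId, body) {
      const message = { id: nextId++, body };
      if (!byUser.has(userId)) byUser.set(userId, []);
      byUser.get(userId).push(message);
      return message;
    },
    // "The last message I saw was X, what did I miss?"
    missedSince(userId, lastSeenId) {
      return (byUser.get(userId) || []).filter((m) => m.id > lastSeenId);
    },
    // The client confirmed everything up to lastSeenId; drop those messages.
    confirmUpTo(userId, lastSeenId) {
      const kept = (byUser.get(userId) || []).filter((m) => m.id > lastSeenId);
      byUser.set(userId, kept);
      return kept.length; // how many messages are still stored
    },
  };
}
```

Realtime delivery then becomes an optimization on top: the server calls `add` first, emits the stored message if the recipient is connected, and relies on `missedSince` when the recipient comes back.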
If you absolutely want guaranteed delivery, then it's important to design your system in such a way that being connected doesn't actually matter, and that realtime delivery is simply a bonus; this almost always involves a data store of some kind. As user568109 mentioned in a comment, there are messaging systems that abstract away the storage and delivery of said messages, and it may be worth looking into such a prebuilt solution. (You will likely still have to write the Socket.IO integration yourself.)
If you're not interested in storing the messages in the database, you may be able to get away with storing them in a local array; the server tries to send U1 the message, and stores it in a list of "pending messages" until U1's client confirms that it received it. If the client is offline, then when it comes back it can tell the server "Hey I was disconnected, please send me anything I missed" and the server can iterate through those messages.
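Luckily, Socket.IO provides a mechanism that allows a client to "respond" to a message, which looks like native JS callbacks. Here is some pseudocode:
// server (`socket` is this client's connection; messages are assumed to be plain strings)
var pendingMessagesForSocket = [];
function removePending(message) {
    var index = pendingMessagesForSocket.indexOf(message);
    if (index !== -1) pendingMessagesForSocket.splice(index, 1);
}
function sendMessage(message) {
    pendingMessagesForSocket.push(message);
    // the callback runs when the client acknowledges the message
    socket.emit('message', message, function() {
        removePending(message);
    });
}
socket.on('reconnection', function(lastKnownMessage) {
    // you may want to make sure you resend them in order, or one at a time, etc.
    // indexOf returns -1 when lastKnownMessage is no longer pending, so everything pending is resent
    var start = pendingMessagesForSocket.indexOf(lastKnownMessage) + 1;
    pendingMessagesForSocket.slice(start).forEach(function(message) {
        socket.emit('message', message, function() {
            removePending(message);
        });
    });
});
// client
var previouslyConnected = false;
var lastKnownMessage = null;
socket.on('connect', function() { // on the client the event is named 'connect'
    if (previouslyConnected) {
        socket.emit('reconnection', lastKnownMessage);
    } else {
        // first connection; any further connections means we disconnected
        previouslyConnected = true;
    }
});
socket.on('message', function(data, callback) {
    // Do something with `data`
    lastKnownMessage = data;
    callback(); // confirm we received the message
});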
This is quite similar to the last suggestion, simply without a persistent data store.
You may also be interested in the concept of event sourcing.
Michelle's answer is pretty much on point, but there are a few other important things to consider. The main question to ask yourself is: "Is there a difference between a user and a socket in my app?" Another way to ask that is "Can each logged in user have more than 1 socket connection at one time?"
In the web world it is probably always a possibility that a single user has multiple socket connections, unless you have specifically put something in place that prevents this. The simplest example of this is if a user has two tabs of the same page open. In these cases you don't care about sending a message/event to the human user just once... you need to send it to each socket instance for that user so that each tab can run its callbacks to update the UI state. Maybe this isn't a concern for certain applications, but my gut says it would be for most. If this is a concern for you, read on....
To solve this (assuming you are using a database as your persistent storage) you would need 3 tables.
users - which is a 1 to 1 with real people
clients - which represents a "tab" that could have a single connection to a socket server. (any 'user' may have multiple)
messages - a message that needs sent to a client (not a message that needs sent to a user or to a socket)
The users table is optional if your app doesn't require it, but the OP said they have one.
The other thing that needs properly defined is "what is a socket connection?", "When is a socket connection created?", "when is a socket connection reused?". Michelle's pseudocode makes it seem like a socket connection can be reused. With Socket.IO, they CANNOT be reused. I've seen this be the source of a lot of confusion. There are real life scenarios where Michelle's example does make sense. But I have to imagine those scenarios are rare. What really happens is when a socket connection is lost, that connection, ID, etc will never be reused. So any messages marked for that socket specifically will never be delivered to anyone, because when the client who had originally connected reconnects, they get a completely brand new connection and new ID. This means it's up to you to do something to track clients (rather than sockets or users) across multiple socket connections.
So for a web based example here would be the set of steps I'd recommend:
1. When a user loads a client (typically a single webpage) that has the potential for creating a socket connection, add a row to the clients database which is linked to their user ID.
2. When the user actually does connect to the socket server, pass the client ID to the server with the connection request.
3. The server should validate the user is allowed to connect and the client row in the clients table is available for connection and allow/deny accordingly.
4. Update the client row with the socket ID generated by Socket.IO.
5. Send any items in the messages table connected to the client ID. There wouldn't be any on initial connection, but if this was from the client trying to reconnect, there may be some.
6. Any time a message needs to be sent to that socket, add a row in the messages table which is linked to the client ID you generated (not the socket ID).
7. Attempt to emit the message and listen for the client with the acknowledgement.
8. When you get the acknowledgement, delete that item from the messages table.
9. You may wish to create some logic on the client side that discards duplicate messages sent from the server since this is technically a possibility as some have pointed out.
10. Then when a client disconnects from the socket server (purposefully or via error), DO NOT delete the client row, just clear out the socket ID at most. This is because that same client could try to reconnect.
11. When a client tries to reconnect, send the same client ID it sent with the original connection attempt. The server will view this just like an initial connection.
12. When the client is destroyed (user closes the tab or navigates away), this is when you delete the client row and all messages for this client. This step may be a bit tricky.
Because the last step is tricky (at least it used to be, I haven't done anything like that in a long time), and because there are cases like power loss where the client will disconnect without cleaning up the client row and never tries to reconnect with that same client row - you probably want to have something that runs periodically to cleanup any stale client and message rows. Or, you can just permanently store all clients and messages forever and just mark their state appropriately.
So just to be clear, in cases where one user has two tabs open, you will be adding two identical message to the messages table each marked for a different client because your server needs to know if each client received them, not just each user.
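The client/socket distinction can be sketched with in-memory Maps standing in for the clients and messages tables. Everything here is hypothetical naming for illustration; the point is that messages are keyed by client ID, while the socket ID is just a replaceable attribute of the client row.

```javascript
// Sketch: Maps stand in for the `clients` and `messages` tables.
function createClientRegistry() {
  const clients = new Map(); // clientId -> { userId, socketId, messages }
  let nextMessageId = 1;
  return {
    // Page load: create the client row.
    register(clientId, userId) {
      clients.set(clientId, { userId, socketId: null, messages: new Map() });
    },
    // (Re)connect: store the new socket ID, return undelivered messages.
    attachSocket(clientId, socketId) {
      const client = clients.get(clientId);
      if (!client) return null; // unknown client: deny the connection
      client.socketId = socketId;
      return [...client.messages.values()];
    },
    // Disconnect: keep the row, only clear the socket ID.
    detachSocket(clientId) {
      const client = clients.get(clientId);
      if (client) client.socketId = null;
    },
    // One message row per client (tab) of the user, not per user.
    queueForUser(userId, body) {
      const queued = [];
      for (const [clientId, client] of clients) {
        if (client.userId !== userId) continue;
        const message = { id: nextMessageId++, clientId, body };
        client.messages.set(message.id, message);
        queued.push(message);
      }
      return queued;
    },
    // Acknowledgement received: delete the message row.
    acknowledge(clientId, messageId) {
      const client = clients.get(clientId);
      return client ? client.messages.delete(messageId) : false;
    },
    // Tab closed for good: delete the client row and its messages.
    destroy(clientId) {
      clients.delete(clientId);
    },
  };
}
```

A periodic cleanup job for stale rows, as described above, would sit on top of `destroy`.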
As already written in another answer, I also believe you should look at the realtime as a bonus : the system should be able to work also with no realtime.
I’m developing an enterprise chat for a large company (iOS, Android, web frontend and .NET Core + Postgres backend) and after having developed a way for the websocket to re-establish connection (through a socket uuid) and get undelivered messages (stored in a queue) I understood there was a better solution: resync via REST API.
Basically I ended up using the websocket just for realtime, with an integer tag on each realtime message (user online, typers, chat message and so on) for monitoring lost messages.
When the client gets an id which is not the previous id plus one, it understands it is out of sync, so it drops all the socket messages and asks for a resync of all its observers through the REST API.
This way we can handle many variations in the state of the application during the offline period without having to parse tons of websocket messages in a row on reconnection, and we are sure to be synced (because the last sync date is set just by the REST API, not from the socket).
The only tricky part is monitoring for realtime messages from the moment you call the REST API to the moment the server replies, because what is read from the db takes time to get back to the client and in the meanwhile variations could happen, so they need to be cached and taken into account.
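The gap detection described above can be sketched on the client as follows. The names are hypothetical, and the sketch simplifies one thing: messages arriving while a resync is in progress are dropped rather than cached, which is the tricky part mentioned above and is left out here.

```javascript
// Sketch: detect a missing sequence number and ask for a resync.
// `onMessage` handles in-order messages; `onOutOfSync(lastSeq, gotSeq)`
// would trigger the REST resync. Both are supplied by the caller.
function createSequenceTracker(onMessage, onOutOfSync) {
  let lastSeq = null;    // last sequence number applied, null before the first
  let resyncing = false; // true between gap detection and resynced()
  return {
    receive(message) {
      if (resyncing) return; // simplification: dropped, not cached
      if (lastSeq !== null && message.seq !== lastSeq + 1) {
        resyncing = true;
        onOutOfSync(lastSeq, message.seq);
        return;
      }
      lastSeq = message.seq;
      onMessage(message);
    },
    // Call when the REST resync finished; `seq` is the sequence number
    // the REST response is current as of.
    resynced(seq) {
      lastSeq = seq;
      resyncing = false;
    },
  };
}
```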
We are going into production in a couple of months,
I hope to get back sleeping by then :)
It seems that you already have a user account system. You know which account is online/offline, and you can handle the connect/disconnect events.
So the solution is to add online/offline status and offline messages to the database for each user:
chatApp.onLogin(function (user) {
user.readOfflineMessage(function (msgs) {
user.sendOfflineMessage(msgs, function (err) {
if (!err) user.clearOfflineMessage();
});
})
});
chatApp.onMessage(function (fromUser, toUser, msg) {
if (toUser.isOnline()) { // check the recipient, not an undefined `user`
toUser.sendMessage(msg, function (err) {
// alert CAN NOT SEND, RETRY?
});
} else {
toUser.addToOfflineQueue(msg);
}
})
Look here: Handle browser reload socket.io.
I think you could use solution which I came up with. If you modify it properly, it should work as you want.
What I think you want is to have a reusable socket for each user, something like:
Client:
socket.on("msg", function(msg){
    socket.emit("msg-conf"); // confirm that the message arrived
});
Server:
// Add this socket property to all users, with your existing user system
user.socket = {
    messages:[],
    io:null
}
user.send = function(msg){ // Call this method to send a message
    var self = this; // `this` is not the user inside the callbacks below
    if(self.socket.io){ // socket.io is set to null when disconnected
        // Wait for confirmation that the message was received.
        var hasconf = false;
        self.socket.io.once("msg-conf", function(){
            // Expect the client to emit "msg-conf"
            hasconf = true;
        });
        // send the message
        self.socket.io.emit("msg", msg); // if connected, emit the "msg" event
        setTimeout(function(){
            if(!hasconf){
                self.socket.io = null; // If the client did not respond, mark them as offline.
                self.socket.messages.push(msg); // Add it to the queue
            }
        }, 60 * 1000); // Make sure this is the same as your timeout.
    } else {
        self.socket.messages.push(msg); // Otherwise, it's offline. Add it to the message queue
    }
}
user.flush = function(){ // Call this when user comes back online
    var queued = this.socket.messages;
    this.socket.messages = []; // empty the queue; send() re-queues anything that fails
    for(var i = 0; i < queued.length; i++){ // For every message in the queue, send it.
        this.send(queued[i]);
    }
}
// Make Sure this runs whenever the user gets logged in/comes online
user.onconnect = function(socket){
    this.socket.io = socket; // Set the socket.io socket
    this.flush(); // Send all messages that are waiting
}
// Make sure this is called when the user disconnects/logs out
user.disconnect = function(){
    this.socket.io = null; // Set the socket to null, so any messages are queued, not sent.
}
Then the socket queue is preserved between disconnects.
Make sure it saves each users socket property to the database and make the methods part of your user prototype. The database does not matter, just save it however you have been saving your users.
This will avoid the problem mentioned in Addition 1 by requiring a confirmation from the client before marking the message as sent. If you really wanted to, you could give each message an id and have the client send the message id to msg-conf, then check it.
In this example, user is the template user that all users are copied from, or like the user prototype.
Note: This has not been tested.
I've been looking at this stuff lately and think a different path might be better.
Try looking at Azure Service Bus; queues and topics take care of the offline states.
The messages wait for the user to come back, and then they get the message.
There is a cost to run a queue, but it's something like $0.05 per million operations for a basic queue, so the cost of development would be higher, from the hours of work needed to write a queuing system.
https://azure.microsoft.com/en-us/pricing/details/service-bus/
And Azure Service Bus has libraries and examples for PHP, C#, Xamarin, Angular, JavaScript etc.
So the server sends the message and does not need to worry about tracking it.
The client can use messages to send back as well, which means it can also handle load balancing if needed.
Try this emit cheat sheet:
io.on('connect', onConnect);
function onConnect(socket){
// sending to the client
socket.emit('hello', 'can you hear me?', 1, 2, 'abc');
// sending to all clients except sender
socket.broadcast.emit('broadcast', 'hello friends!');
// sending to all clients in 'game' room except sender
socket.to('game').emit('nice game', "let's play a game");
// sending to all clients in 'game1' and/or in 'game2' room, except sender
socket.to('game1').to('game2').emit('nice game', "let's play a game (too)");
// sending to all clients in 'game' room, including sender
io.in('game').emit('big-announcement', 'the game will start soon');
// sending to all clients in namespace 'myNamespace', including sender
io.of('myNamespace').emit('bigger-announcement', 'the tournament will start soon');
// sending to individual socketid (private message)
socket.to(<socketid>).emit('hey', 'I just met you');
// sending with acknowledgement
socket.emit('question', 'do you think so?', function (answer) {});
// sending without compression
socket.compress(false).emit('uncompressed', "that's rough");
// sending a message that might be dropped if the client is not ready to receive messages
socket.volatile.emit('maybe', 'do you really need it?');
// sending to all clients on this node (when using multiple nodes)
io.local.emit('hi', 'my lovely babies');
};

node.js + socket.io broadcast from server, rather than from a specific client?

I'm building a simple system like a realtime news feed, using node.js + socket.io.
Since this is a "read-only" system, clients connect and receive data, but clients never actually send any data of their own. The server generates the messages that needs to be sent to all clients, no client generates any messages; yet I do need to broadcast.
The documentation for socket.io's broadcast (end of page) says
To broadcast, simply add a broadcast flag to emit and send method calls. Broadcasting means sending a message to everyone else except for the socket that starts it.
So I currently capture the most recent client to connect, into a variable, then emit() to that socket and broadcast.emit() to that socket, such that this new client gets the new data and all the other clients. But it feels like the client's role here is nothing more than a workaround for what I thought socket.io already supported.
Is there a way to send data to all clients based on an event initiated by the server?
My current approach is roughly:
var socket;
io.sockets.on("connection", function (s) {
socket = s;
});
/* bunch of real logic, yadda yadda ... */
myServerSideNewsFeed.onNewEntry(function (msg) {
socket.emit("msg", { "msg" : msg });
socket.broadcast.emit("msg", { "msg" : msg });
});
Basically the events that cause data to require sending to the client are all server-side, not client-side.
Why not just do like below?
io.sockets.emit('hello',{msg:'abc'});
Since you are emitting events only server side, you should create a custom EventEmitter for your server.
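var io = require('socket.io').listen(80),
    events = require('events'),
    serverEmitter = new events.EventEmitter();
io.sockets.on('connection', function (socket) {
    // here you handle what happens on the 'newFeed' event
    // which will be triggered by the server later on
    function onNewFeed(data) {
        // every connected socket has its own listener, so all connected users get the message
        socket.emit('newFeed', data);
    }
    serverEmitter.on('newFeed', onNewFeed);
    // remove the listener when the client leaves, otherwise listeners pile up
    socket.on('disconnect', function () {
        serverEmitter.removeListener('newFeed', onNewFeed);
    });
});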
// sometime in the future the server will emit one or more newFeed events
serverEmitter.emit('newFeed', data);
Note: newFeed is just an event example, you can have as many events as you like.
Important
The solution above is better also because in the future you might need to emit certain messages only to some clients, not all (thus need conditions). For something simpler (just emit a message to all clients no matter what), io.sockets.emit() is a better fit indeed.
