NodeJS socket.io Broadcast

I'm new to NodeJS; however, I have been able to make some progress.
As it is, I have created a Node.js chat server with socket.io, while the client app is a Java desktop application. Every time a user logs out, the server emits a broadcast notifying all sockets of the logout event. My fear is that, as time goes on, if hundreds of thousands of users are connected, it might consume too much of the server's resources to continually broadcast to everybody each time a user logs out (which could happen at a rate of more than one logout per second).
In view of the above, I'm considering refactoring the code so that each time a user logs out, a database is queried for the friends of the logged-out user, so that ONLY the user's friends are notified of the logout rather than the whole world. But I'm not sure it's a better way to go about it (considering the database queries per logout), so I am throwing the question to the public.
Which of the stated approaches is the better practice?
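For illustration, here is a minimal sketch of the friends-only approach, assuming socket.io v4 and a hypothetical getFriendIds(userId) database lookup (one query per logout):

```js
// Hedged sketch: notify only friends on logout. getFriendIds() is a
// hypothetical async database lookup you would implement yourself.
const { Server } = require('socket.io');

const io = new Server(3000);
const online = new Map(); // userId -> socket

async function getFriendIds(userId) {
  // e.g. SELECT friend_id FROM friendships WHERE user_id = ?
  return [];
}

io.on('connection', (socket) => {
  socket.on('login', (userId) => {
    socket.data.userId = userId;
    online.set(userId, socket);
  });

  socket.on('disconnect', async () => {
    const userId = socket.data.userId;
    if (!userId) return;
    online.delete(userId);

    // One DB query per logout, then emit only to friends who are online.
    const friendIds = await getFriendIds(userId);
    for (const id of friendIds) {
      const friendSocket = online.get(id);
      if (friendSocket) friendSocket.emit('friend logged out', { userId });
    }
  });
});
```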

Related

performance of real time chatting techniques

I assigned myself the task of implementing a 1:1 chat app for my curriculum. Among the various options I used SSE (Server-Sent Events) for real-time chat. From the example projects I was able to implement non-persistent chat between two clients. In every example they use a JS object or array to store the res objects and, by iterating over them, send events to a particular user. But in a real chat app the number of users may increase dramatically, so it is not good to exhaust server resources that way.
I found some other ways to achieve the same functionality, but I'm not sure about their performance.
SSE+setInterval
I used a Redis queue to push offline messages to the user.
When the user establishes the connection, all the unread chats are pushed to the client.
This happens immediately when the client establishes the connection with the server.
I faced a problem here, as I have no way of triggering messages in real time (when both users are online).
So I used setInterval with an interval of 1 second for real-time communication, and wrote a callback that checks whether the queue is empty; if not, it pops a message from the queue and sends it to the user as an event.
Will the above solution affect performance? Because I am calling the function for each connected user at a 1-second interval.
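For reference, a rough sketch of that SSE + setInterval polling, assuming Express and the ioredis client; the queue key name is made up for illustration:

```js
const express = require('express');
const Redis = require('ioredis');

const app = express();
const redis = new Redis();

app.get('/events/:userId', (req, res) => {
  // Standard SSE headers: keep the response open and stream events.
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    Connection: 'keep-alive',
  });

  const queueKey = `chat:queue:${req.params.userId}`;

  // Poll this user's Redis list once per second and flush anything queued.
  const timer = setInterval(async () => {
    let msg;
    while ((msg = await redis.lpop(queueKey)) !== null) {
      res.write(`data: ${msg}\n\n`);
    }
  }, 1000);

  req.on('close', () => clearInterval(timer));
});

app.listen(3000);
```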
Long polling
In long polling, how can I find out whether there is a new message for the user and complete the request?
setInterval would still have to be used on the server side here, but what about performance?
Websockets
In websockets we have a unique id to find the client in the pool of clients, so we can forward a message to a particular user when an event occurs.
Websockets still use a ping/pong mechanism to keep the connection persistent, but the resource utilization is very small, since these are network calls with comparatively little data and are handled asynchronously, so no server resources are wasted.
Questions
How do I trigger res.write only when a new message arrives for a particular user?
Does SSE + setInterval or long polling + setInterval degrade performance as the number of users increases?
Or is there some design pattern to achieve this functionality?
Simply use websockets.
They're fast, convenient and simple.
To send a message in real time when both users are logged in, find the second user by id in a users Array or Map and send the received message to their websocket.
If you have buffered messages for a disconnected user (in memory/database/Redis), check for them when the user connects and send them if they exist.
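A minimal sketch of that Map-based approach, assuming the ws package and a made-up JSON message shape ({ type, userId, to, text }):

```js
const { WebSocketServer, WebSocket } = require('ws');

const wss = new WebSocketServer({ port: 8080 });
const clients = new Map(); // userId -> WebSocket

wss.on('connection', (ws) => {
  ws.on('message', (raw) => {
    const msg = JSON.parse(raw);

    // First message identifies the user; remember their socket.
    if (msg.type === 'login') {
      clients.set(msg.userId, ws);
      // Here you would also flush any buffered/offline messages for msg.userId.
      return;
    }

    // Forward a chat message to the recipient if they are online.
    const peer = clients.get(msg.to);
    if (peer && peer.readyState === WebSocket.OPEN) {
      peer.send(JSON.stringify({ from: msg.userId, text: msg.text }));
    } else {
      // Otherwise buffer it (memory/DB/Redis) for delivery on the next connect.
    }
  });

  ws.on('close', () => {
    for (const [id, sock] of clients) {
      if (sock === ws) clients.delete(id);
    }
  });
});
```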

Chat / System Communication App (Nodejs + RabbitMQ)

So I currently have a chat system running on NodeJS that passes messages via Rabbit, and each connected user has their own unique queue that is subscribed to and listening only for messages meant for them. The backend can also use this chat pipeline to communicate other system messages like notifications/friend requests and other user-event-driven information.
Currently the backend has to loop and publish each message one by one per user, even if the payload of the message is the same for, let's say, 1000 users. I would like to get away from that and be able to send the same message to multiple different users, but not EVERY user who's connected.
(Example: notifying certain users that their friend has come online.)
I considered implementing a Rabbit queue system where all messages are pooled into the same queue and, instead of Rabbit sending to all the user queues, Node takes these messages and emits each one to the appropriate user via socket connections (to whoever is online).
Proposed infrastructure
This way the backend does not need to loop over hundreds or thousands of users and can send a single payload containing all the users this message should go to. I do plan to cluster the NodeJS servers together.
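As a rough sketch of that pooled-queue idea, assuming the amqplib client and socket.io; the queue name and payload shape are illustrative only:

```js
const amqp = require('amqplib');
const { Server } = require('socket.io');

const io = new Server(3000);
const sockets = new Map(); // userId -> socket

io.on('connection', (socket) => {
  // Assume the client identifies itself right after connecting.
  socket.on('login', (userId) => sockets.set(userId, socket));
  socket.on('disconnect', () => {
    for (const [id, s] of sockets) if (s === socket) sockets.delete(id);
  });
});

(async () => {
  const conn = await amqp.connect('amqp://localhost');
  const ch = await conn.createChannel();
  await ch.assertQueue('chat-messages', { durable: true });

  // One shared queue; each message carries the list of recipients.
  ch.consume('chat-messages', (msg) => {
    if (!msg) return;
    const { userIds, payload } = JSON.parse(msg.content.toString());
    for (const id of userIds) {
      const s = sockets.get(id);
      if (s) s.emit('message', payload); // only users connected to this node
    }
    ch.ack(msg);
  });
})();
```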
I was also wondering, since I've never done this in a production environment, whether I will need to track each socket ID.
Potential pitfalls I've identified so far:
slower, since thousands of messages can pile up in a single queue.
manually storing socket IDs in order to manually transmit to users.
offloading routing to NodeJS instead of RabbitMQ.
Has anyone done anything like this before? If so, what are your recommendations? Is it better to scale with user-unique queues, or to pool grouped messages for all users into a smaller number of (but larger) queues?
As a general rule, queue-per-user is an anti-pattern. There are some valid uses of it, but I've never seen it be a good idea for a chat app (in spite of all the demos that use this example).
RabbitMQ can be a great tool for facilitating the delivery of messages between systems, but it shouldn't be used to push messages to users.
I considered implementing a Rabbit queue system where all messages are pooled into the same queue and, instead of Rabbit sending to all the user queues, Node takes these messages and emits each one to the appropriate user via socket connections (to whoever is online).
This is heading in the right direction, but you have to remember that RabbitMQ is not a database (see the previous link, again).
You can't randomly seek specific messages that are sitting in the queue and then leave them there; they are first in, first out.
In a chat app, I would have RabbitMQ handle the message delivery between your systems, but not involve it in delivery to the user.
Your thoughts on using web sockets are the direction you want to head for this. Either that, or Server-Sent Events.
If you need persistence of messages (history, search, last-viewed location, etc.), then use a database for that. Keep a timestamp or other marker of where the user left off, and push messages to them starting at that spot.
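A small sketch of that "marker of where the user left off" idea. The source only says "a database"; MongoDB, the collection and the field names are assumptions for illustration:

```js
const { MongoClient } = require('mongodb');

// lastSeen is the stored marker (e.g. a Date) of where the user left off.
async function pushMissedMessages(userId, lastSeen, socket) {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  const db = client.db('chat');

  // Everything addressed to this user since the marker, oldest first.
  const missed = await db.collection('messages')
    .find({ to: userId, sentAt: { $gt: lastSeen } })
    .sort({ sentAt: 1 })
    .toArray();

  for (const m of missed) {
    socket.emit('message', m);
  }

  await client.close();
}
```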
Your concerns about tracking sockets for the users are definitely something to think about.
If you have multiple instances of your node server running sockets with different users connected, you'll need a way to know which users are connected to which node server.
This may be a good use case for RabbitMQ - but not in a queue-per-user manner; rather, in a binding-per-user manner. You could have each node server create a queue to receive messages from the exchange where messages are published. The node server would then create a binding between the exchange and the queue based on the user id of each user logged in to that particular node server.
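A minimal sketch of that binding-per-user idea, assuming amqplib; the exchange and queue names are illustrative:

```js
const amqp = require('amqplib');

// Each node server gets its own queue and adds/removes a binding
// whenever a user logs in or out on that server.
async function setupUserRouting(serverId) {
  const conn = await amqp.connect('amqp://localhost');
  const ch = await conn.createChannel();

  await ch.assertExchange('chat', 'direct', { durable: true });
  const { queue } = await ch.assertQueue(`chat-server-${serverId}`, { exclusive: true });

  return {
    // Call when a user logs in to this node server.
    bindUser: (userId) => ch.bindQueue(queue, 'chat', userId),
    // Call when the user disconnects.
    unbindUser: (userId) => ch.unbindQueue(queue, 'chat', userId),
    // Deliver everything routed to this server's users.
    consume: (handler) =>
      ch.consume(queue, (msg) => {
        if (!msg) return;
        handler(msg.fields.routingKey, JSON.parse(msg.content.toString()));
        ch.ack(msg);
      }),
  };
}
```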
This could lead to an overwhelming number of bindings in RMQ, though.
You may need a more intelligent method of tracking which server has which users connected, or just ignore that entirely and broadcast every message to every node server. In that case, each server would publish an event through the websocket based on who the message should be delivered to.
If you're using a smart enough websocket library, it will only send the message to the people that need it. Socket.io does this, I know, and I'm sure other websocket libraries are smart like this as well.
...
I probably haven't given you a concrete answer for your situation, and I'm sure you have a lot more context to consider. Hopefully this will get you down the right path, though.

Sending messages between clients socket.io

I'm working on a chat application and using socket.io / Node for it. Basically I came up with the following strategies:
1. Send the message from the client; it is received by the socket server, which then sends it to the receiving client. In the background, store the message in the DB so it can be retrieved later if the user wishes to see his old conversations.
The pro of this approach is that the user gets the message almost instantly, since we don't wait for the DB operation to complete; the con is that if the DB operation fails, and exactly at that time the client refreshes its page to fetch the messages, it won't get that one.
2. Send the message from the client to the server; the server stores it in the DB first and only then sends it to the receiving client.
The pro is that we make sure the message is delivered to the client only if it has been stored in the DB. The con is that it will be nowhere close to real time, since we'll be doing a DB operation in between, slowing down the message passing.
3. Send the message to the server, which stores it in a cache layer (Redis, for example) and then instantly broadcasts it to the receiving client. In the background, keep fetching records from Redis and updating the DB. If the client refreshes the page, we first look into the DB and then into the Redis layer.
The pros are that we make the communication faster and also make sure messages are presented correctly on demand. The con is that this is quite complex compared to the implementations above, and I'm wondering if there's an easier way to achieve it.
My question is: what's the way to go if you're building a serious chat application that ensures both faster communication and data persistence? What are some strategies that apps like Facebook, WhatsApp etc. use for the same? I'm not looking for an exact example, but a few pointers would help.
Thanks.
I would go for option number 2. I've been building chat apps in Node myself and I found that this is the best option. Saving to a database takes only a few milliseconds: the fraction of a millisecond to write to the database plus the few milliseconds of communication latency ( https://blog.serverdensity.com/mongodb-benchmarks/ ).
So I would consider this approach real time. The good thing with it is that if the save fails, you can display a message to the sender saying that it failed, for whatever reason.
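A minimal sketch of option 2 (store first, then deliver), assuming socket.io and MongoDB; the event names, message shape and online map are illustrative:

```js
const { Server } = require('socket.io');
const { MongoClient } = require('mongodb');

async function main() {
  const mongo = await MongoClient.connect('mongodb://localhost:27017');
  const messages = mongo.db('chat').collection('messages');

  const io = new Server(3000);
  const online = new Map(); // userId -> socket

  io.on('connection', (socket) => {
    socket.on('login', (userId) => online.set(userId, socket));

    // The client is assumed to pass an ack callback with each message.
    socket.on('chat message', async (msg, ack) => {
      try {
        // Persist first, so a delivered message is always a stored message.
        await messages.insertOne({ ...msg, sentAt: new Date() });

        const recipient = online.get(msg.to);
        if (recipient) recipient.emit('chat message', msg);
        ack({ ok: true });
      } catch (err) {
        // The DB write failed: tell the sender rather than silently dropping it.
        ack({ ok: false, error: 'could not save message' });
      }
    });
  });
}

main();
```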
Facebook, WhatsApp and many other big messaging apps are based on XMPP (Jabber), which is a very, very big protocol for instant messaging, and how to do things is very well documented. But it is based on XML, so you still have to parse everything, etc.; luckily there are very good libraries for dealing with XMPP. So if you want to go the common way, you can use XMPP, but most of the big players in this area no longer follow all the standards, since the standards don't cover all the features we are used to today.
I would go with doing my own version. Actually, I already have something made (similar to Slack); if you want, I could give you access to it in private.
So, to end this, number 2 is the way to go (for me). XMPP is cool but also brings a lot of complexity.

Node.js - Handling hundreds of user preferences

I'm learning node.js (my web background is mainly PHP) and I'm loving it so far, but I have the following question. In PHP and other similar languages, each request is a single, short-lived execution of the script. All user preferences etc. can be loaded, and there's no issue there, since once the script execution has completed, all resources are released.
In node.js, especially in a long-running process like a chatroom (I'm using socket.io), you will have hundreds or thousands of users being handled by one process. Assuming, for instance, I have a chatroom with 200 people, and I want messages to be highlighted if they come from a participant the user has deemed a "Friend", then I will have to loop through 200 users to see whether each one is a friend or not (especially if chats are to be sent only to friends and not publicly).
Won't this be really slow, especially over time? Is there something I'm missing? In my small tests, as the number of users and the number of messages go up, the responsiveness of the server goes down noticeably.
If you are going to develop a complex chatroom, you have to design the server-side code to maintain the clients' information on the server. For example, you have to map each newly connected client socket to variables at the server side, and if you want to introduce a "Friend" feature, you have to maintain that information on the server as well. That way your server doesn't have to look up every client to see whether they are the correct message receivers.
With all of that implemented, in the scenario of sending a message to the public, the server could first find all the "friend" sockets, send the message highlighted as "Friend" to those sockets, and then send the normal text to the others. A private message to a friend is even easier, as we only consider the friends' sockets.
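A minimal sketch of keeping that friend information server-side, assuming socket.io v4; the login payload and event names are illustrative:

```js
const { Server } = require('socket.io');

const io = new Server(3000);
// userId -> { socket, friends: Set<userId> }
const users = new Map();

io.on('connection', (socket) => {
  socket.on('login', ({ userId, friendIds }) => {
    users.set(userId, { socket, friends: new Set(friendIds) });
    socket.data.userId = userId;
  });

  socket.on('public message', (text) => {
    const senderId = socket.data.userId;
    // No per-message scan of a friends list: each receiver's friend set is
    // already in memory, so the check is a constant-time Set lookup.
    for (const [id, user] of users) {
      if (id === senderId) continue;
      const highlighted = user.friends.has(senderId);
      user.socket.emit('message', { from: senderId, text, highlighted });
    }
  });
});
```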
So you still need to reuse some of the design patterns you used in PHP; socket.io only maintains the long-lived connections for you, and that is all.

What is the best way to keep track of which users are online in nodejs?

So I am developing (more playing around with) a realtime game in node.js. I am also using Redis and Socket.io. Players create a lobby and join it (kind of like a pre-game chat room, where you can talk to other players and select game settings). The client is written in HTML/CSS/JS. Anyway, I want to be able to tell when players disconnect from the lobby, so I can update the number of joined players on the interface (and the joined players' names).
Two options I have thought about are:
Using Redis' key expiry (TTL) feature to remove a particular field if it is not updated within x amount of time. I would then have the host check the existence of this field to detect disconnects. I do wonder if this is highly inefficient, as many users will potentially be playing, so will it be bad to have many expiring values in Redis and also many other users polling these fields?
I could use Socket.io's on('disconnect', ...) event to update the field. However, I am not sure whether this event will fire if, for example, a user's PC freezes.
Anyway I am open to any other ideas also!
Socket.io has a 'heartbeat' to check that the connection is still alive. The default heartbeat timeout is 15s. You can read more about configuring it in this wiki. If the heartbeat fails (e.g. the user's PC freezes), then socket.io will emit the 'disconnect' event.
Socket.io should suffice. You can configure it to use heartbeats to ping the socket and check its health. If a user's computer freezes, it will, in effect, not be able to respond to these heartbeats, causing socket.io to force a disconnect.
To test this, you could set up Socket.io to use heartbeats, then connect via a browser on a different computer, and in that browser paste an infinite loop into the console to simulate a freeze.
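A small sketch of relying on the heartbeat plus the 'disconnect' event, assuming socket.io v4 option names; the lobby bookkeeping is illustrative:

```js
const { Server } = require('socket.io');

const io = new Server(3000, {
  pingInterval: 10000, // how often the server pings each client
  pingTimeout: 5000,   // how long to wait for a pong before dropping the client
});

const lobby = new Map(); // socket.id -> player name

io.on('connection', (socket) => {
  socket.on('join', (name) => {
    lobby.set(socket.id, name);
    io.emit('lobby update', [...lobby.values()]);
  });

  // Fires on explicit disconnects and on missed heartbeats (e.g. a frozen PC).
  socket.on('disconnect', () => {
    lobby.delete(socket.id);
    io.emit('lobby update', [...lobby.values()]);
  });
});
```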
