I am designing a system with Socket.io, Node.js, Express.js, Redis, and AngularJS in which reliable message delivery is crucial.
Use Case:
My system needs to listen for notifications on resources, which can also be shared among multiple users. Sockets are connected to the server with information about the resource, so that I can maintain a list of sockets for each resource. When there is a notification for a resource, I send it on every socket for that resource.
ResourceID1 = ["socketID1", "socketID2"] // Two sockets listening to ResourceID1
I also update this list on socket disconnect by removing that specific socket ID from the resourceID list.
This way I make sure the notification is sent to every user who is sharing the resource, to every session of that user, and to every socket, since one session can have multiple tabs. In short, "every socket".
I am also maintaining a hash for the resource notification so that every time there is a new notification for a resource the particular resource notification hash will be updated.
ResourceID1-notificationHash : { key1:"124", key2: "abc" }
The reason for doing this is that a socket may disconnect, for example because the user closed the tab or, more importantly, because the user's Internet connection dropped and the server stopped receiving heartbeats. When the socket connection is made again, the last message for that resource is sent to that socket.
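The bookkeeping described so far (a socket list per resource plus the latest notification, replayed on reconnect) could be sketched roughly as follows. This is only an in-memory illustration: `ResourceRegistry` and its method names are hypothetical, and in the setup described here the state would presumably live in Redis.

```javascript
// Hypothetical in-memory registry; a stand-in for the Redis-backed lists and hashes.
class ResourceRegistry {
  constructor() {
    this.socketsByResource = new Map(); // resourceId -> Set of socket ids
    this.lastNotification = new Map();  // resourceId -> latest notification payload
  }

  // Called when a socket connects or reconnects.
  // Returns the last notification to replay, or null if there is none yet.
  connect(resourceId, socketId) {
    if (!this.socketsByResource.has(resourceId)) {
      this.socketsByResource.set(resourceId, new Set());
    }
    this.socketsByResource.get(resourceId).add(socketId);
    return this.lastNotification.get(resourceId) ?? null;
  }

  // Called on socket disconnect: drop the socket id from the resource's list.
  disconnect(resourceId, socketId) {
    const sockets = this.socketsByResource.get(resourceId);
    if (!sockets) return;
    sockets.delete(socketId);
    if (sockets.size === 0) this.socketsByResource.delete(resourceId);
  }

  // Stores the notification and returns the socket ids that should receive it.
  notify(resourceId, payload) {
    this.lastNotification.set(resourceId, payload);
    return [...(this.socketsByResource.get(resourceId) ?? [])];
  }
}
```

Note that only the most recent notification is kept, so a socket that was offline for several updates would see just the latest one after reconnecting.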
Is my implementation enough to make sure that a particular socket (client) always receives the message?
I know I can implement callbacks for an event, but what I did instead is this: when a particular tab (socket) that was disconnected connects again, I send the last message for that resource.
Are there any other network glitches which can cause problems for reliable message delivery?
My implementation is different in the sense that, for message delivery, I am considering sockets rather than users. One user can have multiple sessions at the same time, say in Firefox and Chrome, and each session can have multiple tabs open. I chose sockets because my resources can be shared among multiple users, so for one resource update I may have to send the message to multiple users, each having multiple sessions, and each session having multiple sockets.
Intro
We're developing a system to support multiple real-time messages (chat) and updates (activity notifications).
That is, user A can receive WebSocket messages for:
receiving new chat messages
receiving updates for some activity, for example if someone likes their photo.
and more.
We use one single WebSocket connection to send all these different messages to the client.
However, we also support multiple applications/clients to be open by the user at the same time.
(i.e., user A connects from their web browser and from their mobile app at the same time).
Architecture
We have a "Hub" that stores a map of UserId to a list of active websocket sessions.
(user:123 -> listOf(session#1, session#2))
Each client, once websocket connection is established, has its own Consumer which subscribes to a pulsar topic "userId" (e.g - user:123 topic).
If user A connected on both mobile and web, each client has its own Consumer to topic user:A.
When user A sends a new message from session #1 to user B, the flow is :
user makes a REST POST request to send a message.
service stores a new message to DB.
service sends a Pulsar message to topic user:B and user:A.
return 200 status code + created Message response.
Problem
If user A has two sessions open (two clients/websockets), and they send a message from session #1, how can we make sure only session #2 gets the message ?
Since user A has already received the 200 response with the created message in session #1, there's no need to send the message to him again by sending a message to his Consumer.
I'm not sure if it's a Pulsar configuration, or perhaps our architecture is wrong.
how can we make sure only session #2 gets the message ?
I'm going to address this at the app level.
Prepend a unique nonce (e.g. a guid) to each message sent.
Maintain a short list of recently sent nonces,
aging them out so we never have more than, say, half a dozen.
Upon receiving a message,
check to see if we sent it.
That is, check to see if its nonce is in the list.
If so, silently discard it.
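A minimal sketch of that nonce check, assuming Node.js and its built-in `crypto.randomUUID`. The `RecentNonces` class and the `{ nonce, body }` message shape are made up for illustration.

```javascript
const { randomUUID } = require('crypto');

// Bounded list of nonces this client sent recently (default of 6 follows "half a dozen").
class RecentNonces {
  constructor(limit = 6) {
    this.limit = limit;
    this.nonces = [];
  }

  // Tag an outbound message with a fresh nonce and remember the nonce.
  tag(body) {
    const nonce = randomUUID();
    this.nonces.push(nonce);
    if (this.nonces.length > this.limit) this.nonces.shift(); // age out the oldest
    return { nonce, body };
  }

  // True if this client sent the message itself and should silently discard it.
  isOwn(message) {
    return this.nonces.includes(message.nonce);
  }
}
```

If more than `limit` messages are sent before the first one echoes back, that echo would no longer be recognized, so the limit has to be sized for the expected round-trip delay.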
Equivalently, name each connection.
You could roll a guid just once when a new websocket is opened.
Or you could incorporate some of the websocket's addressing
bits into the name.
Prepend the connection name to each outbound message.
Discard any received message which has "sender" of "self".
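The connection-name variant might look like this. It is again a sketch: `createConnection`, `wrap`, and `accept` are hypothetical names, and the guid comes from Node's `crypto.randomUUID`.

```javascript
const { randomUUID } = require('crypto');

// One name per websocket, rolled once when the connection is opened.
function createConnection() {
  const name = randomUUID();
  return {
    name,
    // Prepend the connection name to each outbound message.
    wrap(body) {
      return { sender: name, body };
    },
    // Keep only messages that some other connection sent.
    accept(message) {
      return message.sender !== name;
    },
  };
}
```

Unlike the nonce list, this keeps no per-message state, so nothing has to be aged out.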
With this de-dup'ing approach
there's still some wasted network bandwidth.
We can quibble about it if you wish.
When the K-th websocket is created,
we could create K topics,
each excluding a different endpoint.
Sounds like more work than it's worth!
I assigned myself the task of implementing a 1:1 chat app for my curriculum. Among the various options, I used SSE for real-time chat. Working from example projects, I was able to implement non-persistent chat between two clients. Every example uses a JS object or array to store the res objects and iterates over them to send events to a particular user. But in a real chat app the number of users may increase dramatically, so I don't want to exhaust server resources.
I found some other ways to achieve the same functionality, but I am not sure about their performance.
SSE+setInterval
I used a Redis queue to hold offline messages for the user.
When the user establishes the connection, all the unread chats are pushed to the client.
This happens immediately when the client establishes a connection with the server.
I faced a problem here, as I have no way of triggering the messages in real time (when both users are online).
So I used setInterval with an interval of 1 second for real-time communication, with a callback that checks whether the queue is empty and, if not, pops a message from the queue and sends it to the user as an event.
Will the above solution affect performance? I am calling the function once per second for each connected user.
Long polling
In long polling, how can I find out whether there is a new message for the user and complete the request?
setInterval would still have to be used on the server side here, but what about performance?
Websockets
With WebSockets we have a unique ID to find the client in the pool of clients, so we can forward a message to a particular user when an event occurs.
WebSockets still use a ping-pong mechanism to keep the connection alive, but the resource utilization is very small: these are network calls with comparatively little data and they are handled asynchronously, so no server resources are wasted.
Questions
How can I trigger res.write only when a new message arrives for a particular user?
Does SSE+setInterval or long polling+setInterval degrade performance as the number of users increases?
Otherwise, is there a design pattern to achieve this functionality?
Simply use WebSockets.
It's fast, convenient and simple.
To send a message in real time when both users are logged in, find the second user by ID in a users Array or Map and send the received message to their websocket.
If you have buffered messages for a disconnected user (in memory, a database, or Redis), check for them when the user connects and send any that exist.
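Roughly, that could look like the sketch below. The `online` and `pending` maps and the function names are hypothetical. A connection is anything with a `send(message)` method, such as a websocket, and the in-memory buffer stands in for whatever store holds offline messages.

```javascript
// userId -> connection object with a send(message) method (hypothetical shape).
const online = new Map();
// userId -> messages buffered while the user was offline (stand-in for memory/database/Redis).
const pending = new Map();

// Called whenever a message for userId arrives.
function deliver(userId, message) {
  const connection = online.get(userId);
  if (connection) {
    connection.send(message); // recipient online: push immediately
  } else {
    if (!pending.has(userId)) pending.set(userId, []);
    pending.get(userId).push(message); // recipient offline: buffer for later
  }
}

// Called when a user connects: register the connection and flush the backlog in order.
function onConnect(userId, connection) {
  online.set(userId, connection);
  for (const message of pending.get(userId) ?? []) connection.send(message);
  pending.delete(userId);
}

// Called when a user disconnects.
function onDisconnect(userId) {
  online.delete(userId);
}
```

Sending is triggered by the arrival of the message itself, so no per-user timer is polling a queue.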
I have multiple processes & servers, and am using socket.io-redis with session affinity (on Heroku).
When a client connects, I store a map of the client's user ID to the client's socket ID in Redis. Upon disconnecting, I delete the client's user ID from Redis.
I would like to be able to push a message to a client by looking up the client's socket ID, then sending a message to the specific socket ID.
One way to do this is io.to(socketId).emit(...). However, this does not allow me to send a callback with the message, which I would really like because I want to have confirmation of message receipt.
Another way I have found is io.connected[socketId].emit(...). The problem with this is that it does not scale to multiple servers, since the io.connected object may not necessarily be the same on every server since it only records sockets that the server itself is connected to. This solution does allow callbacks, however.
Is there a way to solve this so I can emit messages to a specific socket ID from any server or process, and also send a callback with the message?
I currently have a chat system running on Node.js that passes messages via RabbitMQ. Each connected user has their own unique queue, subscribed to and listening only for messages addressed to them. The backend can also use this chat pipeline to communicate other system messages such as notifications, friend requests, and other user-event-driven information.
Currently the backend would have to loop and publish each message 1 by 1 per user even if the payload of the message is the same for let's say 1000 users. I would like to get away from that and be able to send the same message to multiple different users but not EVERY user who's connected.
(example : notifying certain users their friend has come online).
I considered implementing a RabbitMQ queue system where all messages are pooled into the same queue and, instead of RabbitMQ delivering to every user queue, Node takes these messages and emits each message to the appropriate user via socket connections (to whoever is online).
Proposed - infrastructure
This way the backend does not need to loop for 100s and 1000s of users and can send a single payload containing all users this message should go to. I do plan to cluster the nodejs servers together.
I was also wondering, since I've never done this in a production environment, whether I will need to track each socket ID.
Potential pitfalls i've identified so far:
slower since 1000s of messages can pile up in a single queue.
manually storing socket IDs to manually transmit to users.
offloading routing to NodeJS instead of RabbitMQ
Has anyone done anything like this before? If so, what are your recommendations? Is it better to scale with unique per-user queues, or to pool all grouped messages for all users into a smaller number of (larger) queues?
as a general rule, queue-per-user is an anti-pattern. there are some valid uses of this, but i've never seen it be a good idea for a chat app (in spite of all the demos that use this example)
RabbitMQ can be a great tool for facilitating the delivery of messages between systems, but it shouldn't be used to push messages to users.
I considered implementing a RabbitMQ queue system where all messages are pooled into the same queue and, instead of RabbitMQ delivering to every user queue, Node takes these messages and emits each message to the appropriate user via socket connections (to whoever is online).
this is heading down the right direction, but you have to remember that RabbitMQ is not a database (see previous link, again).
you can't randomly seek specific messages that are sitting in the queue and then leave them there. they are first in, first out.
in a chat app, i would have rabbitmq handling the message delivery between your systems, but not involved in delivery to the user.
your thoughts on using web sockets are going to be the direction you want to head for this. either that, or Server Sent Events.
if you need persistence of messages (history, search, last-viewed location, etc) then use a database for that. keep a timestamp or other marker of where the user left off, and push messages to them starting at that spot.
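a rough sketch of that "marker" idea, using an in-memory array as a stand-in for the database and a sequence number as the marker (a timestamp would work the same way). `MessageHistory` and its methods are made-up names.

```javascript
// Append-only message log with a per-user "where they left off" marker.
class MessageHistory {
  constructor() {
    this.messages = [];        // { seq, body } in insertion order
    this.lastSeen = new Map(); // userId -> seq of the last message pushed to them
    this.nextSeq = 1;
  }

  // Persist a new message and return it with its sequence number.
  append(body) {
    const message = { seq: this.nextSeq++, body };
    this.messages.push(message);
    return message;
  }

  // Return everything after the user's marker, then advance the marker.
  catchUp(userId) {
    const marker = this.lastSeen.get(userId) ?? 0;
    const missed = this.messages.filter((m) => m.seq > marker);
    if (missed.length > 0) {
      this.lastSeen.set(userId, missed[missed.length - 1].seq);
    }
    return missed;
  }
}
```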
your concerns about tracking sockets for the users are definitely something to think about.
if you have multiple instances of your node server running sockets with different users connected, you'll need a way to know which users are connected to which node server.
this may be a good use case for rabbitmq - but not in a queue-per-user manner. rather, in a binding-per-user. you could have each node server create a queue to receive messages from the exchange where messages are published. the node server would then create a binding between the exchange and queue based on the user id that is logged in to that particular node server
this could lead to an overwhelming number of bindings in rmq, though.
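to make the binding-per-user idea concrete, here is a small in-memory model of it. this is not the RabbitMQ client API. `UserExchange` only mimics how a direct exchange would map a routing key (the user id) to the queues of the node servers that user is connected to, and all names are hypothetical.

```javascript
// In-memory model of a direct exchange: routing key (user id) -> set of server queue names.
class UserExchange {
  constructor() {
    this.bindings = new Map();
  }

  // A node server binds its own queue for a user who just logged in on it.
  bind(userId, serverQueue) {
    if (!this.bindings.has(userId)) this.bindings.set(userId, new Set());
    this.bindings.get(userId).add(serverQueue);
  }

  // Remove the binding when the user disconnects from that server.
  unbind(userId, serverQueue) {
    const queues = this.bindings.get(userId);
    if (!queues) return;
    queues.delete(serverQueue);
    if (queues.size === 0) this.bindings.delete(userId);
  }

  // Server queues that would receive a message published for this user.
  route(userId) {
    return [...(this.bindings.get(userId) ?? [])];
  }
}
```

there is one queue per node server and one binding per connected user, which is where the binding count grows.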
you may need a more intelligent method of tracking which server has which users connected, or just ignore that entirely and broadcast every message to every node server. in that case, each server would publish an event through the websocket based on who the message should be delivered to.
if you're using a smart enough websocket library, it will only send the message to the people that need it. socket.io did this, i know, and i'm sure other websocket libraries are smart like this, as well.
...
I probably haven't given you a concrete answer to your situation, and I'm sure you have a lot more context to consider. hopefully this will get you down the right path, though.
i have a problem implementing sockets. Case:
the user has n number of rooms in his list,
user should be able to receive notifications from each of the rooms.
method 1) open a socket for each room the user has. in this method the user has to open multiple sockets, one per room
method 2) the user opens a single socket with room name = userid,
node maintains a list ('room_user') of each room and users in that room (this can be done on connection).
eg
const room_user = {
  room1: ["user1Id", "user2Id"], // users in room1
  room2: ["user1Id", "user3Id"]  // users in room2
};
For sending a message, the server gets the user IDs from the list for the specified room and then emits the message in a loop to all of those users. In this approach the user has to open only one socket, but the server has to emit the same message in a loop.
I want to know which method would be better suited.
If you consider the underlying TCP/IP system, you would probably find that it is better that the user have a single websocket connection and the server loop and send the same message again and again (method 2 in your question).
Allow me to explain:
TCP doesn't support broadcasting: every TCP connection is point-to-point. For this reason, sending the same message to multiple connections is actually implemented by looping over the list of connections and sending the same message again and again...
It's true that your code will be moving the loop to a higher level of the application, but it would probably be better than having many connections that would hinder your ability to scale the application.
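A minimal sketch of method 2 under these assumptions: each user has exactly one socket, represented here by any object with a `send` method, and the `roomUsers` and `userSockets` maps and the function names are hypothetical.

```javascript
// roomId -> Set of user ids, and userId -> that user's single socket.
const roomUsers = new Map();
const userSockets = new Map();

// Remember the one socket a user opened.
function registerSocket(userId, socket) {
  userSockets.set(userId, socket);
}

// Record that a user belongs to a room (e.g. on connection).
function joinRoom(roomId, userId) {
  if (!roomUsers.has(roomId)) roomUsers.set(roomId, new Set());
  roomUsers.get(roomId).add(userId);
}

// Loop over the room's members and send the same message on each user's socket.
// Returns how many sockets the message was sent on.
function emitToRoom(roomId, message) {
  let delivered = 0;
  for (const userId of roomUsers.get(roomId) ?? []) {
    const socket = userSockets.get(userId);
    if (socket) {
      socket.send({ room: roomId, message });
      delivered += 1;
    }
  }
  return delivered;
}
```

Including the room in the payload lets the client tell which of its rooms a notification belongs to, since everything arrives on the same connection.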