Thoughts on how you would (or do) handle this: if you need to send a webhook when something happens, say when a customer registers, how would you guarantee that a webhook has actually been sent for it? You can't just have the code send the webhook after the registration happens, because what if the server crashes right after the registration but before or during the webhook request? If you were using MongoDB, I guess you could listen to the oplog and see when a registration record is inserted, but then what if your oplog monitoring script crashes? I thought about keeping a local file that records the last successfully processed timestamp, but then you're really limiting yourself to processing the oplog one entry at a time, which on an active DB could really fall behind.
I thought about using a queuing/task system, but then what if the request to store the task in the queue fails?
Maybe store a field on the registration DB record that says whether the webhook has been sent yet, send the webhook once the registration happens, and have a script that periodically checks for webhooks that failed to send within a certain amount of time (something like the sketch below)?
Just trying to think of the best way to make sure a webhook is actually sent for a particular event.
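For what it's worth, that last idea (a sent flag on the record plus a retry script) is essentially an outbox-style pattern. A rough sketch of it, assuming the official MongoDB Node.js driver; deliverWebhook() and the endpoint URL are hypothetical stand-ins for the real webhook call:

```js
// Sketch of the "sent flag + retry script" idea, using the official MongoDB Node.js
// driver. deliverWebhook() is a hypothetical helper standing in for the real HTTP call.
import { MongoClient } from 'mongodb';

const client = new MongoClient('mongodb://localhost:27017');
await client.connect();
const registrations = client.db('app').collection('registrations');

// Hypothetical delivery helper: POST the registration to the subscriber, throw on failure.
async function deliverWebhook(doc) {
  const res = await fetch('https://example.com/hooks/registration', {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify(doc),
  });
  if (!res.ok) throw new Error(`webhook failed: ${res.status}`);
}

async function trySendWebhook(id) {
  const doc = await registrations.findOne({ _id: id, webhookSentAt: null });
  if (!doc) return; // already sent (or never existed)
  await deliverWebhook(doc);
  await registrations.updateOne({ _id: id }, { $set: { webhookSentAt: new Date() } });
}

export async function registerCustomer(data) {
  // The "not sent yet" marker is written in the same insert as the registration,
  // so a crash after this line can never lose the intent to send the webhook.
  const { insertedId } = await registrations.insertOne({ ...data, webhookSentAt: null });

  // Best-effort immediate send; the sweeper below covers the crash window.
  await trySendWebhook(insertedId).catch(() => {});
}

// Sweeper: every minute, retry anything registered but never marked as sent.
setInterval(async () => {
  for await (const doc of registrations.find({ webhookSentAt: null })) {
    await trySendWebhook(doc._id).catch(() => {});
  }
}, 60_000);
```

Note that this gives at-least-once rather than exactly-once delivery: if the process dies between the HTTP call and the update, the sweeper sends the webhook again, so the receiver has to tolerate an occasional duplicate.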
Related
Background
I have a monolith Node.js + PostgreSQL app that, among other things, needs to provide real-time in-app notifications to end users.
It is currently implemented in the following way:
there's a db table notifications with the columns state (pending/sent), userid (the id of the notification receiver), isRead (whether the user has read the notification), and type and body (the notification data).
once specific resources get created or specific events occur, a varying number of users should receive in-app notifications. When a notification is created, it gets persisted to the db and sent to the user over WebSockets (roughly like the sketch after this list). Notifications can also be created by a cron job.
when a user receives N notifications of the same type, they get collapsed into a single notification. This is done via a db trigger that deletes the repeated notifications and inserts a new one.
usually this works fine, but when the number of receivers exceeds several thousand, the app lags, other requests get blocked, or not all notifications get sent over WebSockets.
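If it helps to picture it, the flow described above amounts to something like this (a simplified sketch; node-postgres and the ws library are assumed, and only the table and column names come from the description, the rest is illustrative):

```js
// Simplified sketch of the send-out path described above: persist the notification,
// then push it over an open WebSocket if the receiver is connected.
// Assumes node-postgres (pg) and the ws library; the sockets Map is filled in
// by the WebSocket connection handler elsewhere.
import pg from 'pg';

const pool = new pg.Pool({ connectionString: process.env.DATABASE_URL });
const sockets = new Map(); // userId -> WebSocket

export async function notifyUsers(userIds, type, body) {
  for (const userId of userIds) {
    // 1. Persist, so the notification survives if the user is offline.
    const { rows } = await pool.query(
      `INSERT INTO notifications (userid, type, body, state, "isRead")
       VALUES ($1, $2, $3, 'pending', false)
       RETURNING id`,
      [userId, type, body]
    );

    // 2. Push in real time if the user currently has an open connection.
    const socket = sockets.get(userId);
    if (socket && socket.readyState === 1 /* OPEN */) {
      socket.send(JSON.stringify({ id: rows[0].id, type, body }));
      await pool.query(`UPDATE notifications SET state = 'sent' WHERE id = $1`, [rows[0].id]);
    }
  }
}
```

Doing this loop inline in a request handler is fine for a handful of receivers, but for 5-6k of them it is exactly where the endpoint starts to block, which is the problem described below.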
Examples of notifications
Article published
A user is awarded with points
A user logged in multiple times but didn't perform some action
One user sends a friend request to another
One user sent a message to another
if a user receives 3+ Article published notifications, they get collapsed into a single N articles published notification (N gets updated as more notifications of the same type arrive).
What I currently have doesn't seem to work very well. For example, for the Article published event, the API endpoint that handles the creation also handles the notification send-out (which is maybe not a good approach: it creates ~5-6k notifications and sends them to users via WebSockets).
Question
How to correctly design such functionality?
Should I stay with a node.js + db approach or add a queuing service? Redis Pub/Sub? RabbitMQ?
We deploy to a k8s cluster, so adding another service is not a problem. The more important question is: is it really needed in my case?
I would love some general advice or resources to read on this topic.
I've read several articles on messaging/queuing/notification system design but still don't quite get whether any of it fits my case.
Should the queue store the notifications, or should they stay in the db? And what's the correct way to notify thousands of users in real time (WebSockets? SSE?)?
Also, the more I read about queues and message brokers, the more it feels like I'm overcomplicating things and getting more confused.
Consider using the Temporal open source project. It would allow modeling each user lifecycle as a separate program. Temporal makes the code fully fault tolerant and preserves its full state (including local variables and blocking await calls) across process restarts.
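For illustration, a workflow written against Temporal's TypeScript/Node SDK might look roughly like this (plain JavaScript, all names are placeholders, and the worker/client setup is omitted; treat it as a sketch of the programming model rather than a drop-in implementation):

```js
// workflows.js - illustrative Temporal workflow (TypeScript/Node SDK, plain JS here).
// All names (notifyUser, persistNotification, pushOverWebSocket) are placeholders.
import { proxyActivities } from '@temporalio/workflow';

// Activities perform the side effects (DB writes, WebSocket pushes);
// Temporal retries them and records their results durably.
const { persistNotification, pushOverWebSocket } = proxyActivities({
  startToCloseTimeout: '30 seconds',
  retry: { maximumAttempts: 5 },
});

// One workflow execution per (user, notification). Its state, including which
// step it has reached, survives worker crashes and restarts.
export async function notifyUser(userId, notification) {
  await persistNotification(userId, notification);
  await pushOverWebSocket(userId, notification);
}
```

The API endpoint would then just start one of these workflow executions per receiver (or a single batched workflow) instead of doing the 5-6k sends inline.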
I assigned myself the task of implementing a 1:1 chat app for my curriculum. Among the various options I used SSE for real-time chat. From the example projects I was able to implement a non-persistent chat between two clients. In every example they use a JS object or array to store the res objects, and by iterating over them they send events to a particular user. But in a real chat app the number of users may grow dramatically, so it is not good to exhaust server resources like that.
I found some other ways to achieve the same functionality but am not sure about their performance:
SSE+setInterval
I used a Redis queue to hold offline messages for a user.
When the user establishes a connection, I push all the unread chats to the client.
This happens immediately when the client establishes the connection with the server.
I faced a problem here, as I had no way of triggering messages in real time (when both users are online).
So I used setInterval with an interval of 1 second for real-time communication, with a callback that checks whether the queue is empty and, if not, pops a message from the queue and sends it to the user as an event.
Will the above solution affect performance? I am calling the function for each connected user, once per second.
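For reference, the approach described above looks roughly like this (a sketch assuming Express and node-redis v4; the per-user list key name is made up):

```js
// Sketch of the SSE + setInterval approach described above. Assumes Express and
// node-redis v4; the per-user list key name is made up.
import express from 'express';
import { createClient } from 'redis';

const app = express();
const redis = createClient();
await redis.connect();

app.get('/chat/stream/:userId', async (req, res) => {
  res.set({ 'Content-Type': 'text/event-stream', 'Cache-Control': 'no-cache' });
  res.flushHeaders();

  const key = `chat:${req.params.userId}`;

  // On connect, flush everything queued while the user was offline.
  for (const msg of await redis.lRange(key, 0, -1)) {
    res.write(`data: ${msg}\n\n`);
  }
  await redis.del(key);

  // The part the question is about: poll the queue once per second per connection.
  const timer = setInterval(async () => {
    const msg = await redis.lPop(key);
    if (msg) res.write(`data: ${msg}\n\n`);
  }, 1000);

  req.on('close', () => clearInterval(timer));
});

app.listen(3000);
```

Every open connection gets its own one-second timer plus a Redis round trip, which is exactly the overhead the question above is about.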
Long polling
With long polling, how can I find out whether there is a new message for a user and complete the request?
Here too, setInterval would have to be used on the server side, but what about performance?
Websockets
With WebSockets we have a unique id to find the client in the pool of clients, so we can forward a message to a particular user when an event occurs.
WebSockets do use a ping/pong mechanism to keep the connection alive, but the resource utilization is very small, since these are network calls with comparatively little data and are handled asynchronously, so there is little waste of server resources.
Questions
How can I trigger res.write only when a new message arrives for a particular user?
Does SSE+setInterval or long polling+setInterval degrade performance as the number of users increases?
Or is there some design pattern to achieve this functionality?
Simply use WebSockets.
They're fast, convenient and simple.
To send a message in real time when both users are logged in, find the second user by id in a users Array or Map and send the received message to their WebSocket.
If you have buffered messages for a disconnected user (in memory, a database or Redis), check for them when the user connects and send them if they exist.
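A minimal sketch of that with the ws library (the userId-in-the-URL handshake and the { to, text } message shape are made up for illustration, and there is no auth here):

```js
// Minimal 1:1 relay with the ws library. The userId-in-the-URL handshake and the
// { to, text } message shape are made up for illustration; there is no auth here.
import { WebSocket, WebSocketServer } from 'ws';

const wss = new WebSocketServer({ port: 8080 });
const clients = new Map();   // userId -> socket
const buffered = new Map();  // userId -> [pending messages]

wss.on('connection', (socket, req) => {
  const userId = new URL(req.url, 'http://localhost').searchParams.get('userId');
  clients.set(userId, socket);

  // Deliver anything buffered while this user was offline.
  for (const msg of buffered.get(userId) ?? []) socket.send(JSON.stringify(msg));
  buffered.delete(userId);

  socket.on('message', (raw) => {
    const { to, text } = JSON.parse(raw);
    const peer = clients.get(to);
    if (peer && peer.readyState === WebSocket.OPEN) {
      // Both users online: forward immediately.
      peer.send(JSON.stringify({ from: userId, text }));
    } else {
      // Peer offline: buffer in memory (use Redis/a database in practice).
      if (!buffered.has(to)) buffered.set(to, []);
      buffered.get(to).push({ from: userId, text });
    }
  });

  socket.on('close', () => clients.delete(userId));
});
```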
I have a React web application and a REST API (Express.js).
I found that using an EventStream is a better choice if you do not want to use long polling or sockets (there is no need to send data client->server).
Use case:
A user opens a page with an empty table, to which other users can add data via POST /data.
The table is filled with initial data from the API via GET /data.
Then the page connects to an EventStream on /data/stream and listens for updates.
Someone adds a new row and the table needs to be updated...
Is it possible to broadcast this change (a new row added) from the backend (the controller that adds rows) to all users who are connected to /data/stream?
It is generally not good practice to have a fetch for the initial data and then a separate live stream for updates, because there is a window where data can arrive on the server between the initial fetch and the start of the live stream.
Usually that means you either miss messages or get duplicates (data delivered by both the fetch and the stream). You can eliminate duplicates by tracking some kind of id or sequence number, but that means additional coding and computation.
SSE can be used for both the initial fetch and the live updates on a single stream, avoiding the aforementioned sync challenges.
The client creates an EventSource to initiate an SSE stream. The server responds with the data that is already there, and thereafter publishes any new data that arrives on the server.
If you want, the server can include an event id with each message. Then, if the client gets disconnected, it automatically reconnects and sends the last event id it saw (the Last-Event-ID header), and the data flow resumes from where it left off. On the client side this auto-reconnect and resume behaviour is part of the EventSource standard, so the developer doesn't have to do anything.
SSE is kind of like an HTTP/REST/XHR request that stays open and continues to stream data, so you get the best of both worlds. The API is lightweight, easy to understand, and standards-based.
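A sketch of that single-stream approach with Express (the in-memory rows array and the id scheme are simplified stand-ins for the real data source):

```js
// Single SSE stream that serves the existing rows first, then live updates,
// tagging each message with an id so a reconnecting client can resume.
// Express-based sketch; the in-memory rows array stands in for the real data source.
import express from 'express';
import { EventEmitter } from 'node:events';

const app = express();
app.use(express.json());

const rows = [];                 // { id, ...data } in insertion order
const feed = new EventEmitter(); // emits 'row' whenever a new row is added

app.get('/data/stream', (req, res) => {
  res.set({ 'Content-Type': 'text/event-stream', 'Cache-Control': 'no-cache' });
  res.flushHeaders();

  // EventSource sends Last-Event-ID automatically when it reconnects.
  const lastId = Number(req.get('Last-Event-ID') ?? 0);
  const send = (row) => res.write(`id: ${row.id}\ndata: ${JSON.stringify(row)}\n\n`);

  // 1. Initial data on the same stream (no separate GET /data needed).
  rows.filter((row) => row.id > lastId).forEach(send);

  // 2. Live updates as they arrive.
  feed.on('row', send);
  req.on('close', () => feed.off('row', send));
});

app.post('/data', (req, res) => {
  const row = { id: rows.length + 1, ...req.body };
  rows.push(row);
  feed.emit('row', row); // broadcast to every open /data/stream connection
  res.status(201).json(row);
});

app.listen(3000);
```

Because the initial rows and the live updates share one stream and one id sequence, there is no gap or duplicate between "fetch" and "subscribe".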
I will try to answer myself :)
It never occurred to me that I could use just about any pub/sub mechanism on the backend. Every user who connects to the stream (/data/stream) gets subscribed, and the server simply publishes whenever it receives a new row via POST /data.
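If the backend ever runs as more than one Node process, the same idea can be backed by Redis pub/sub instead of an in-process emitter (a sketch assuming node-redis v4; the channel name is made up):

```js
// Same broadcast idea, but backed by Redis pub/sub so that a row POSTed to one
// server instance reaches clients streaming from another instance.
// Assumes node-redis v4; the 'data:rows' channel name is made up.
import { createClient } from 'redis';

const publisher = createClient();
const subscriber = publisher.duplicate(); // pub/sub needs its own connection
await publisher.connect();
await subscriber.connect();

// In the POST /data handler, after saving the row:
export async function broadcastRow(row) {
  await publisher.publish('data:rows', JSON.stringify(row));
}

// At startup: forward everything published on the channel to the local SSE clients.
export async function subscribeRows(onRow) {
  await subscriber.subscribe('data:rows', (message) => onRow(JSON.parse(message)));
}
```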
I am currently working on a web application using the MEAN stack. It has a social aspect to it so I want to be able to push notifications to users.
The way I do it now: when something happens that should produce a notification, it gets stored in a Mongo database with an unread flag. Each client sends a GET request to the server every 30 seconds, receives every notification marked as unread, and then marks them as read.
I want to switch to using a message queue and sockets so that fewer network resources are used, and also to give the user a real-time experience. I've thought about using Redis and its pub/sub structure, but I can't seem to figure out how to do this securely. If I push out notifications to the affected users, won't it be easy for someone malicious to subscribe to somebody else's channel and receive notifications not meant for them? Am I missing something, or is it just the wrong approach for such a system?
Edit: I figured I'd update with the solution I went with, in case anyone else reading this has the same problem.
Instead of using RabbitMQ, as the answer suggested, I found that a much easier and more elegant solution is to just use socket.io. When a new socket connects to the server, I save a mapping from the userID to the socketId in a Redis in-memory DB (after I've validated their token). That way, if I need to push a notification to a user, I just look up the socketId in Redis and send the notification to the correct socket.
I don't need any security beyond that, as socket IDs are unguessable and the message is only sent across the single socket that belongs to the given user.
The message only travels over the connection of that one socket, because socket IDs are used purely server-side to keep track of the connections, so no one else can "listen" using someone else's socket ID.
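A sketch of that setup, assuming socket.io and node-redis v4; verifyToken() is a placeholder for whatever token validation the app already does:

```js
// Sketch of the userID -> socket.id mapping approach described above.
// Assumes socket.io and node-redis v4; verifyToken() is a placeholder for the
// app's real JWT/session validation.
import { Server } from 'socket.io';
import { createClient } from 'redis';

const redis = createClient();
await redis.connect();

const io = new Server(3000);

// Placeholder: replace with real token validation that returns the userId.
async function verifyToken(token) {
  if (!token) throw new Error('missing token');
  return token; // pretend the token *is* the userId for this sketch
}

io.use(async (socket, next) => {
  try {
    // Validate the token sent in the handshake before trusting the userId.
    socket.data.userId = await verifyToken(socket.handshake.auth.token);
    next();
  } catch (err) {
    next(new Error('unauthorized'));
  }
});

io.on('connection', async (socket) => {
  await redis.set(`socket:${socket.data.userId}`, socket.id);
  socket.on('disconnect', () => redis.del(`socket:${socket.data.userId}`));
});

// Called by whatever code produces notifications.
export async function pushNotification(userId, notification) {
  const socketId = await redis.get(`socket:${userId}`);
  if (socketId) io.to(socketId).emit('notification', notification);
}
```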
You can use RabbitMQ for this; it has authentication and access control built in. Please go through the following link and try it:
https://www.rabbitmq.com/access-control.html
You can also add authentication to your existing structure by issuing subscription auth tokens to the subscribed users only.
Redis also has its own security around topics/channels; please have a look at the link below:
https://redis.io/topics/security
I have a Java-based web application (Struts 1.2). I have a requirement to display a status on the frontend (JSP). The status might change, and my server gets notified of that change by another server. I want this status change to be pushed to the browser as well.
I don't want to refresh at intervals. Rather, I want to implement something like Gmail chat does, i.e. the browser gets notified of changing events on the server.
Any ideas on how to go about this?
I was thinking along the lines of opening a request to the server for the status, holding the request on the server end, and not responding until there is a status change. Any pointers or examples for this?
The best possible solution would be to make use of the XMPP protocol. It's standardized, and a lot of open source solutions will get you started within minutes. You can use a combination of Smack, StropheJS and Openfire to get your Java-based app working as desired.
There's a method called long polling (Comet). The client basically sends a request to the server, and the request thread created on the server simply waits for new data for that user, with a time limit of maybe 1 minute or more. When new data is available, it is returned.
The main problem to tackle is on the server side: you don't want one thread per user just sitting there waiting for new data. Of course, you can use asynchronous methods, depending on your back-end.
Ref: http://en.wikipedia.org/wiki/Push_technology
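On the browser side, the long-polling loop itself is simple regardless of the server stack. A sketch (the /status endpoint and its wait parameter are made up):

```js
// Browser-side long-polling loop. The server is expected to hold /status open until
// the status changes or a timeout passes; the endpoint name and wait parameter are made up.
async function pollStatus(onChange) {
  for (;;) {
    try {
      const res = await fetch('/status?wait=60');          // server holds this for up to ~60s
      if (res.status === 200) onChange(await res.json());  // status changed: update the page
      // On a timeout with no change the server can answer 204 and we simply loop again.
    } catch (err) {
      // Network hiccup: back off briefly before re-polling.
      await new Promise((resolve) => setTimeout(resolve, 2000));
    }
  }
}

pollStatus((status) => {
  document.getElementById('status').textContent = status.text;
});
```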
An alternative would be to use WebSockets. The problem is that they're not supported by all browsers today.