Background
I have a monolith Node.js + PostgreSQL app that, besides other things, needs to provide real-time in-app notifications to end users.
It is currently implemented in the following way:
There's a db table notifications with the columns state (pending/sent), userid (id of the notification receiver), isRead (whether the user has read the notification), type, and body (the notification data).
Once specific resources get created or specific events occur, a varying number of users should receive in-app notifications. When a notification is created, it gets persisted to the db and sent to the user over WebSockets. Notifications can also be created by a cron job.
When a user receives N notifications of the same type, they get collapsed into a single notification. This is done via a db trigger that deletes the repeated notifications and inserts a new one.
Usually it works fine. But when the number of receivers exceeds several thousand, the app lags, other requests get blocked, or not all notifications get sent via WebSockets.
Examples of notifications
Article published
A user is awarded with points
A user logged in multiple times but didn't perform some action
One user sends a friend request to another
One user sent a message to another
If a user receives 3+ Article published notifications, they get collapsed into an N articles published notification (N gets updated as more notifications of the same type arrive).
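The collapsing rule can be sketched in a few lines. The question does this in a database trigger; the version below expresses the same rule in application code purely for illustration. The field names (`type`, `body`, `count`, `collapsed`) and the threshold constant are assumptions, not the asker's actual schema.

```javascript
// Hypothetical threshold: collapse once a user has 3 or more of the same type.
const COLLAPSE_THRESHOLD = 3;

/**
 * Collapses one user's notifications by type.
 * `notifications` is an array of { type, body } objects (assumed shape).
 */
function collapseNotifications(notifications, threshold = COLLAPSE_THRESHOLD) {
  // Group notifications by type, preserving first-seen order of the types.
  const byType = new Map();
  for (const n of notifications) {
    if (!byType.has(n.type)) byType.set(n.type, []);
    byType.get(n.type).push(n);
  }

  const result = [];
  for (const [type, group] of byType) {
    if (group.length >= threshold) {
      // One summary entry replaces the whole group; `count` plays the role of N.
      result.push({ type, collapsed: true, count: group.length });
    } else {
      result.push(...group);
    }
  }
  return result;
}
```

Keeping a `count` on the summary entry means a later notification of the same type only needs to increment it instead of deleting and re-inserting rows.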
What I currently have doesn't seem to work very well. For example, for the Article created event, the API endpoint that handles the creation also handles the notification send-outs (which may not be a good approach: it creates ~5-6k notifications and sends them to users via WebSockets).
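One way to keep a 5-6k send-out from blocking other requests is to send in batches and yield to the event loop between them. The following is a minimal sketch under assumptions: `sendToUser` is a hypothetical callback (for example a WebSocket write), and the batch size of 500 is arbitrary.

```javascript
/** Splits `items` into arrays of at most `size` elements. */
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

/**
 * Sends one notification to many users in batches, yielding to the event
 * loop after each batch so other requests are not starved.
 * `sendToUser` is a hypothetical function supplied by the caller.
 */
async function fanOut(userIds, notification, sendToUser, batchSize = 500) {
  let sent = 0;
  for (const batch of chunk(userIds, batchSize)) {
    for (const userId of batch) {
      sendToUser(userId, notification);
      sent += 1;
    }
    // Let pending I/O callbacks (other HTTP requests, socket reads) run.
    await new Promise((resolve) => setImmediate(resolve));
  }
  return sent;
}
```

The endpoint could respond to the article creation first and start `fanOut` afterwards, so the HTTP request does not wait for thousands of socket writes.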
Question
How to correctly design such functionality?
Should I stay with a node.js + db approach or add a queuing service? Redis Pub/Sub? RabbitMQ?
We deploy to the k8s cluster, so adding another service is not a problem. More important question - is it really needed in my case?
I would love some general advice or resources to read on this topic.
I've read several articles on messaging/queuing/notifications system design but still don't quite get if this fits my case.
Should the queue store the notifications or should they be in the db? What's the correct way to notify thousands of users in real-time (websockets? SSE?)?
Also, the more I read about queues and message brokers, the more it feels like I'm overcomplicating things and getting more confused.
Consider using the Temporal open source project. It would allow modeling each user's lifecycle as a separate program. Temporal makes the code fully fault tolerant and preserves its full state (including local variables and blocking await calls) across process restarts.
Related
I'm building an application in which we have integrated a payment gateway named Flutterwave.
On every success or failure of a payment, I receive a webhook, and then we take further actions such as sending emails and SMS and updating the status of the payment in the DB.
For now, we have implemented polling on the client side: if the client receives a status (success or fail) within a particular time span, we show it; otherwise the user can check it later on the payment history page.
Now we want to remove this polling and update users in real time about the success or failure of a payment.
What are the ways by which we can achieve this?
The question is how to notify a specific user, given that we have a multiplatform app and the same user can be logged in on different platforms.
What you are looking for is a real-time communication pattern using WebSockets, a layer 7 protocol in the OSI model that offers bi-directional communication.
This means that you can establish communication between your servers and your user's browser (client). As a result, you can send notification data to the client, and the client can consume and react to the notification by showing visual cues in your UI for the user to see.
Some examples of implementing WebSockets with Socket.io and Nodejs: https://dev.to/novu/sending-real-time-notifications-with-socketio-in-nodejs-1l5j
There are also paid services that can offer this functionality like Pusher, and I would actually recommend that route at the beginning so you can avoid spending too much time implementing this and focus more on the stuff that matters and is part of your roadmap.
Additionally, you can use Push Notifications as another way to notify your users even when they are not using the app.
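Since the same user can be logged in on several platforms at once, the server needs to track every open connection per user rather than a single one. Below is a minimal in-memory sketch. The class and method names are hypothetical, and a connection is assumed to be any object with a `send(string)` method, which is the shape of a standard WebSocket.

```javascript
// In-memory registry: userId -> set of open connections (one per device or tab).
class ConnectionRegistry {
  constructor() {
    this.connections = new Map();
  }

  add(userId, conn) {
    if (!this.connections.has(userId)) this.connections.set(userId, new Set());
    this.connections.get(userId).add(conn);
  }

  remove(userId, conn) {
    const conns = this.connections.get(userId);
    if (!conns) return;
    conns.delete(conn);
    // Drop the entry once the user's last device disconnects.
    if (conns.size === 0) this.connections.delete(userId);
  }

  /** Sends `payload` to every connection of `userId`; returns the delivery count. */
  sendToUser(userId, payload) {
    const conns = this.connections.get(userId);
    if (!conns) return 0;
    const message = JSON.stringify(payload);
    for (const conn of conns) conn.send(message);
    return conns.size;
  }
}
```

The payment webhook handler would then call something like `registry.sendToUser(userId, { status: "success" })` after updating the DB, and every logged-in device of that user receives the update.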
One of the requirements of my app is that when one user makes any insert/update/delete, all users viewing a page with a list of that record type get pushed an update containing the change. The user should not be expected to repeat an API call to refresh the dozens of records that did not change, because the push should contain a short summary of the change that occurred.
I accomplished this in my small dev server using SocketIO. I can't scale this across more than one server. My target infrastructure is AWS, and I know AWS has a push notification service, but I believe it's mobile-only and not what I'm looking for. The huge number of data streams being subscribed to is the reason I haven't considered a serverless infrastructure.
I'm new to AWS and have never attempted horizontal scaling either, so please forgive me if my entire question is ignorant.
Have you taken a look at using the AWS IoT MQTT messaging protocol? Each browser is a 'device', and you have JavaScript listening in the browser for messages published via a socket protocol. Each service pushes a message to MQTT when it has an update. There are some good POCs out there (e.g. medium.com/#jparreira/…)
My application stack is iOS (front-end) and Node.js (back-end). I have to send notifications to devices. In the Node.js part I'm using the apns module to send notifications, and it's working fine.
Now I have to send mass notifications: say I have 10,000 devices to notify at a time. The logic I'm following is:
I loop through the 10,000 devices and call the APNs provider for each one.
1. Why this for-loop approach?
I have to store each notification's details in my MongoDB collection, so I followed this approach.
The problem is that the notification is received by only some devices, and even then very late (the next day).
I also read this link:
https://www.raywenderlich.com/156966/push-notifications-tutorial-getting-started
which says that APNs will reject.
Is the above approach correct? Is there any way to make sure all notifications are delivered?
Please share your ideas. Thanks in advance.
If you need to process each individual notification before or after it is sent, I would recommend moving away from a loop and looking at a job queue instead.
With this design pattern, instead of your only step being to loop over notifications and send them via APNs, you push the notifications into a queue/messaging system and have workers that pull from the queue and process them (send via APNs and write to Mongo). The nice part of this design is that as your application grows you can add more workers to handle the increased load without rewriting your application or architecture.
I personally use RabbitMQ for my job queue, but that decision is something you need to research on your own. For example if you don't want to manage the messaging system you could look into something like AWS Simple Queue Service.
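The worker pattern described above can be sketched without any broker at all. The snippet below is an in-memory stand-in, not RabbitMQ or SQS code: `processQueue` and its parameters are hypothetical names, and `handler` represents whatever a worker does per notification (send via APNs, then write to Mongo).

```javascript
/**
 * Runs `handler` over `jobs` with at most `concurrency` jobs in flight.
 * An in-memory stand-in for workers pulling from a real message broker.
 */
async function processQueue(jobs, handler, concurrency = 5) {
  const queue = [...jobs];
  const results = { done: 0, failed: 0 };

  async function worker() {
    // Each worker keeps pulling until the shared queue is empty.
    while (queue.length > 0) {
      const job = queue.shift();
      try {
        await handler(job);
        results.done += 1;
      } catch (err) {
        // A real system would retry or dead-letter the job here.
        results.failed += 1;
      }
    }
  }

  const workerCount = Math.min(concurrency, queue.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

With a real broker the queue lives outside the process, so adding capacity means starting more worker processes rather than raising `concurrency`.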
I think looping through 10,000 device ids and calling the APNs provider for each one is not the right way forward. The node-apn README explicitly says to reuse apn.Provider rather than recreate it every time, to achieve the best possible performance.
If you send the notification to an array of device ids rather than a single device id, you will get a response from APNs with the details for each device.
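A rough sketch of that idea: one long-lived provider, with device tokens sent in array batches and the per-device results merged. It assumes `provider` exposes `send(notification, recipients)` resolving to an object with `sent` and `failed` arrays, which is my understanding of node-apn's `Provider` but should be checked against its README. The function name and batch size are hypothetical.

```javascript
/**
 * Sends `notification` to all `deviceTokens` through a single reused provider,
 * passing tokens as arrays instead of one call per device.
 * Assumes provider.send(notification, tokens) resolves to { sent, failed }.
 */
async function sendInBatches(provider, notification, deviceTokens, batchSize = 1000) {
  const summary = { sent: [], failed: [] };
  for (let i = 0; i < deviceTokens.length; i += batchSize) {
    const batch = deviceTokens.slice(i, i + batchSize);
    const response = await provider.send(notification, batch);
    // Collect per-device outcomes so they can be stored afterwards.
    summary.sent.push(...response.sent);
    summary.failed.push(...response.failed);
  }
  return summary;
}
```

The merged `summary` can then be written to the database in bulk, which covers the requirement to store each notification's details without looping over individual sends.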
I currently have a chat system running NodeJS that passes messages via RabbitMQ. Each connected user has their own unique queue that is subscribed and listens only for messages meant for them. The backend can also use this chat pipeline to communicate other system messages such as notifications, friend requests and other user-event-driven information.
Currently the backend has to loop and publish each message one by one per user, even if the payload of the message is the same for, let's say, 1000 users. I would like to get away from that and be able to send the same message to multiple different users, but not EVERY user who's connected.
(example : notifying certain users their friend has come online).
I considered implementing a rabbit queue system where all messages are pooled into the same queue and, instead of rabbit sending to all user queues, node takes these messages and emits each message to the appropriate user via socket connections (to whoever is online).
Proposed - infrastructure
This way the backend does not need to loop for 100s and 1000s of users and can send a single payload containing all users this message should go to. I do plan to cluster the nodejs servers together.
I was also wondering, since I've never done this in a production environment, whether I will need to track each socketID.
Potential pitfalls i've identified so far:
slower, since 1000s of messages can pile up in a single queue.
manually storing socket IDs to manually transmit to users.
offloading routing to NodeJS instead of RabbitMQ.
Has anyone done anything like this before? If so, what are your recommendations? Is it better to scale with unique per-user queues, or to pool all grouped messages for all users into fewer (but larger) queues?
as a general rule, queue-per-user is an anti-pattern. there are some valid uses of this, but i've never seen it be a good idea for a chat app (in spite of all the demos that use this example)
RabbitMQ can be a great tool for facilitating the delivery of messages between systems, but it shouldn't be used to push messages to users.
I considered implementing a rabbit queue system where all messages are pooled into the same queue and, instead of rabbit sending to all user queues, node takes these messages and emits each message to the appropriate user via socket connections (to whoever is online).
this is heading down the right direction, but you have to remember that RabbitMQ is not a database (see previous link, again).
you can't randomly seek specific messages that are sitting in the queue and then leave them there. they are first in, first out.
in a chat app, i would have rabbitmq handling the message delivery between your systems, but not involved in delivery to the user.
your thoughts on using web sockets are going to be the direction you want to head for this. either that, or Server Sent Events.
if you need persistence of messages (history, search, last-viewed location, etc) then use a database for that. keep a timestamp or other marker of where the user left off, and push messages to them starting at that spot.
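The "marker of where the user left off" idea can be sketched as a per-user cursor over an ordered message log. The class below is an in-memory stand-in for a database table; `MessageLog`, `append` and `catchUp` are hypothetical names, and an auto-incrementing `id` is assumed as the marker (a timestamp would work the same way).

```javascript
// In-memory stand-in for a messages table plus a per-user "last seen" marker.
class MessageLog {
  constructor() {
    this.messages = [];
    this.cursors = new Map(); // userId -> id of the last message delivered
    this.nextId = 1;
  }

  append(body) {
    const message = { id: this.nextId++, body };
    this.messages.push(message);
    return message;
  }

  /** Returns everything after the user's cursor and advances the cursor. */
  catchUp(userId) {
    const lastSeen = this.cursors.get(userId) ?? 0;
    const unseen = this.messages.filter((m) => m.id > lastSeen);
    if (unseen.length > 0) {
      this.cursors.set(userId, unseen[unseen.length - 1].id);
    }
    return unseen;
  }
}
```

On reconnect, the server calls `catchUp(userId)` and pushes the result over the socket, so nothing has to wait in a broker queue while the user is offline.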
your concerns about tracking sockets for the users are definitely something to think about.
if you have multiple instances of your node server running sockets with different users connected, you'll need a way to know which users are connected to which node server.
this may be a good use case for rabbitmq - but not in a queue-per-user manner. rather, in a binding-per-user manner. you could have each node server create a queue to receive messages from the exchange where messages are published. the node server would then create a binding between the exchange and queue based on the id of each user that is logged in to that particular node server.
this could lead to an overwhelming number of bindings in rmq, though.
you may need a more intelligent method of tracking which server has which users connected, or just ignore that entirely and broadcast every message to every node server. in that case, each server would publish an event through the websocket based on who the message should be delivered to.
if you're using a smart enough websocket library, it will only send the message to the people that need it. socket.io did this, i know, and i'm sure other websocket libraries are smart like this, as well.
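a minimal sketch of the broadcast-to-every-server option: each node server receives every message and delivers only to the users connected to it. the message shape (`recipients`, `event`, `payload`) and the function name are assumptions; a socket is assumed to be any object with an `emit(event, payload)` method, as in socket.io.

```javascript
/**
 * Called on every server instance for every broadcast message.
 * `localSockets` maps userId -> socket for users connected to THIS instance only.
 * Returns how many recipients were reached on this instance.
 */
function deliverLocally(localSockets, message) {
  let delivered = 0;
  for (const userId of message.recipients) {
    const socket = localSockets.get(userId);
    if (socket) {
      socket.emit(message.event, message.payload);
      delivered += 1;
    }
    // Users connected elsewhere (or offline) are simply skipped here.
  }
  return delivered;
}
```

the backend publishes one message with the full recipient list, so it never loops per user; the cost is that every server sees every message, including ones with no local recipients.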
...
I probably haven't given you a concrete answer to your situation, and I'm sure you have a lot more context to consider. hopefully this will get you down the right path, though.
I'm working on a chat application and using socket.io / node for that. Basically I came up with the following strategies:
1. Send the message from the client to the socket server, which then sends it to the receiving client. In the background, I store the message in the DB so it can be retrieved later if the user wishes to see his old conversations.
The pro of this approach is that the user gets the message almost instantly, since we don't wait for the DB operation to complete. The con is that if the DB operation fails and the client refreshes its page at exactly that time to fetch the message, it won't get it.
2. Send the message from the client to the server; the server stores it in the DB first and only then sends it to the receiving client.
The pro is that we make sure the message reaches the client only if it is stored in the DB. The con is that it will be nowhere close to real time, since the DB operation in between slows down the message passing.
3. Send the message from the client to the server, which stores it in a cache layer (Redis, for example) and then instantly broadcasts it to the receiving client. In the background, keep fetching records from Redis and updating the DB. If the client refreshes the page, we first look in the DB and then in the Redis layer.
The pro is that we make the communication faster and also make sure messages are presented correctly on demand. The con is that this is quite complex compared to the implementations above, and I'm wondering if there's an easier way to achieve this.
My question is: what's the way to go if you're building a serious chat application that ensures both fast communication and data persistence? What are some strategies that apps like Facebook, WhatsApp etc. use for this? I'm not looking for an exact example, but a few pointers will help.
Thanks.
I would go for option number 2. I've built chat apps in Node myself and found this to be the best option. Saving to a database takes a few milliseconds, which includes the 0.x milliseconds to write to the database and the few milliseconds of communication latency ( https://blog.serverdensity.com/mongodb-benchmarks/ ).
So I would consider this approach real time. The good thing about it is that if the save fails, you can display a message to the sender that it failed, for whatever reason.
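A minimal sketch of option 2, assuming three hypothetical callbacks supplied by the application: `saveMessage` (the DB write, returning the stored message), `deliver` (the socket emit to the receiver) and `notifySender` (the failure feedback to the sender). The message fields `from` and `to` are also assumptions.

```javascript
/**
 * Option 2: persist first, deliver only after the write succeeds.
 * `saveMessage`, `deliver` and `notifySender` are hypothetical callbacks.
 */
async function handleIncomingMessage(message, { saveMessage, deliver, notifySender }) {
  let saved;
  try {
    saved = await saveMessage(message);
  } catch (err) {
    // The receiver never sees a message that is not in the DB,
    // and the sender is told that sending failed.
    notifySender(message.from, { error: "Message could not be saved", reason: err.message });
    return { ok: false };
  }
  deliver(saved.to, saved);
  return { ok: true, message: saved };
}
```

Because delivery happens strictly after the write, a page refresh on the receiving side can never miss a message that was already shown.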
Facebook, WhatsApp and many other big messaging apps are based on XMPP (Jabber), a very large protocol for instant messaging. It is very well documented, but it is XML-based, so you still have to parse everything; luckily there are very good libraries for handling XMPP. So if you want to go the common way, you can use XMPP, but most of the big players in this area no longer follow all the standards, since XMPP does not have all the features we are used to today.
I would go with building my own version. Actually, I already have something made (similar to Slack); if you want, I could give you access to it in private.
So to end this, number 2 is the way to go (for me). XMPP is cool but brings also a lot of complexity.