EventStreams (SSE) - Broadcasting updates to clients. Is it possible? - node.js

I have a React web application and a REST API (Express.js).
I found that using an EventStream is a better choice if you don't want to use long-polling or sockets (there is no need to send data client->server).
Use case:
A user opens a page with an empty table; other users can add data via POST /data.
The table is filled with initial data from the API via GET /data.
The page then connects to an EventStream at /data/stream and listens for updates.
Someone adds a new row and the table needs to be updated...
Is it possible to broadcast this change (a new row added) from the backend (the controller that adds rows) to all users who are connected to /data/stream?

It is generally not good practice to have a fetch for the initial data, then a separate live stream for updates. That's because there is a window where data can arrive on the server between the initial fetch and the live update stream.
Usually, that means you either miss messages or you get duplicates that are published to both. You can eliminate duplicates by tracking some kind of id or sequence number, but that means additional coding and computation.
SSE can be used for both the initial fetch and the live updates on a single stream, avoiding the aforementioned sync challenges.
The client creates an EventSource to initiate an SSE stream. The server responds with the data that is already there, and thereafter publishes any new data that arrives on the server.
If you want, the server can include an event id with each message. Then, if a client becomes disconnected, the SSE client automatically reconnects with the last event id and the data flow resumes from where it left off. This reconnect-and-resume behaviour is specified by the standard, so the developer doesn't have to do anything on the client side.
SSE is kind of like an HTTP/REST/XHR request that stays open and continues to stream data, so you get the best of both worlds. The API is lightweight, easy to understand, and standards-based.
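To make that concrete, here is a minimal sketch of this pattern in Express. The in-memory `rows` array and the process-local `bus` emitter are stand-ins for whatever store and pub/sub you actually use:

```js
const express = require('express');
const { EventEmitter } = require('events');

const app = express();
const bus = new EventEmitter();
bus.setMaxListeners(0);   // one listener per connected client
const rows = [];          // all rows added so far, in insertion order

app.get('/data/stream', (req, res) => {
  res.set({
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    'Connection': 'keep-alive',
  });
  res.flushHeaders();

  // Replay history: resume after the id the client last saw, or send everything.
  const lastId = Number(req.headers['last-event-id'] ?? -1);
  rows.forEach((row, id) => {
    if (id > lastId) res.write(`id: ${id}\ndata: ${JSON.stringify(row)}\n\n`);
  });

  // Then push each new row as it arrives.
  const onRow = (row, id) => res.write(`id: ${id}\ndata: ${JSON.stringify(row)}\n\n`);
  bus.on('row', onRow);
  req.on('close', () => bus.off('row', onRow));
});

app.listen(3000);
```

On the client, `new EventSource('/data/stream')` plus a `message` handler is all that is needed; the browser handles reconnection and the Last-Event-ID header by itself.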

I will try to answer it myself :)
It never occurred to me that I can use just about any pub/sub system on the backend. Every user who connects to the stream (/data/stream) gets subscribed, and the server simply publishes whenever it receives a new row via POST /data.
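Continuing the sketch from the answer above, the POST controller only has to store the row and publish it; every subscribed stream then writes it out:

```js
app.use(express.json());

app.post('/data', (req, res) => {
  const row = req.body;
  const id = rows.push(row) - 1;  // the array index doubles as the event id
  bus.emit('row', row, id);       // broadcast to all /data/stream subscribers
  res.status(201).json({ id });
});
```

With an external pub/sub such as Redis in place of the in-process emitter, the same pattern works across multiple server instances.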

Related

How to use socket.io properly with express app

I wonder how to use socket.io properly with my express app.
I have a REST API written in express/node.js and I want to use socket.io to add real-time features to my app. Suppose I want to do something I can already do just by sending a request to my REST API. What should I do with socket.io? Should I send a request to the REST API and then send the socket.io client the result of the process, or handle the whole process within a socket.io event handler and then send the result to the socket.io client?
Thanks in advance.
The question is not that clear, but what I'm getting from it is that you want to know what you would use it for that you can't already do with your current API?
The short answer is: well, nothing really. WebSockets are just the natural progression of APIs and the need for a more 'real-time' interface between systems.
An older method (still used and relevant for the right use case) is long polling, where you keep checking back with the server for updated items and grab them if there are any. This works, but it can be expensive in terms of establishing a connection, performing a lookup, then closing the connection.
WebSockets keep that connection open, allowing both the client and server to communicate in real time. For example, say you make an update to your backend data and want users to get that update. Using long polling, you would rely on each client to ping back to the server, check whether there is an update, and grab it if so. This can cause lag between updates: some users have the updated data while others do not, etc.
Now take the same scenario with WebSockets: you make an update to the backend data and hit submit; this emits to your socket server. The socket server takes the call, performs the task (grabs the updated data) and emits it to the users, so each connected user instantly gets that update.
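As a rough sketch of that flow, the REST route can perform the write and then broadcast through the socket.io server; `saveItem` here is a placeholder for your real persistence call:

```js
const express = require('express');
const http = require('http');
const { Server } = require('socket.io');

const app = express();
const server = http.createServer(app);
const io = new Server(server);

app.use(express.json());

app.post('/items', async (req, res) => {
  const item = await saveItem(req.body);  // hypothetical DB helper
  io.emit('item:updated', item);          // every connected client gets the update
  res.status(201).json(item);
});

server.listen(3000);
```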
Socket servers are typically used for things like real-time chat or live polls, where packets are small, but they are also used for web games and the like. The size of your payloads will determine how best to send data back and forth, because the larger the payload, the more resources and bandwidth the socket server needs, so it's something to consider.

performance of real time chatting techniques

I assigned myself the task of implementing a 1:1 chat app for my curriculum. Among the various options, I chose SSE for real-time chat. From the example projects I was able to implement non-persistent chat between two clients. In every example they use a JS object or array to store the res objects, and by iterating over them they send events to a particular user. But in a real chat app the number of users may grow dramatically, so it is not good to exhaust server resources that way.
I found some other ways to achieve the same functionality, but I am not sure about their performance.
SSE+setInterval
I used a Redis queue to push offline messages to the user.
When the user establishes the connection, all unread chats are pushed to the client.
This happens immediately when the client connects to the server.
I faced a problem here: I have no way of triggering messages in real time (when both users are online).
So I used setInterval with an interval of 1 second and wrote a callback that checks whether the queue is empty and, if it is not, pops a message from the queue and sends it to the user as an event (a sketch of this loop is shown below).
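For reference, here is roughly what that per-connection polling loop looks like, using ioredis on top of an existing Express app; the queue key name is an assumption:

```js
const Redis = require('ioredis');
const redis = new Redis();

app.get('/chat/stream/:userId', (req, res) => {
  res.set({ 'Content-Type': 'text/event-stream', 'Cache-Control': 'no-cache' });
  res.flushHeaders();

  // Wake up once per second and forward one queued message, if any.
  const timer = setInterval(async () => {
    const msg = await redis.lpop(`queue:${req.params.userId}`);
    if (msg) res.write(`data: ${msg}\n\n`);
  }, 1000);
  req.on('close', () => clearInterval(timer));
});
```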
Will the above solution affect performance? I am calling that function for each connected user at a 1-second interval.
Long polling
In long polling, how can I find out whether there is a new message for a user and then complete the request?
setInterval would still be used on the server side here, but what about performance?
Websockets
In WebSockets we have a unique id for finding a client in the pool of clients, so we can forward a message to a particular user when an event occurs.
WebSockets do use a ping/pong mechanism to keep the connection persistent, but the resource utilization is very small, as these are network calls with comparatively little data, handled asynchronously, so there is little waste of server resources.
Questions
How can I trigger res.write only when a new message arrives for a particular user?
Does SSE+setInterval or long polling+setInterval degrade performance as the number of users increases?
Otherwise, is there a design pattern that achieves this functionality?
Simply use WebSockets.
They're fast, convenient and simple.
To send a message in real time when both users are online, find the second user by id in a users Array or Map and send the received message to their websocket.
If you have buffered messages for a disconnected user (in memory/database/Redis), check for them when the user connects and send them if they exist.
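A minimal sketch of that approach with the `ws` package; taking the user id from the query string, and the in-memory buffer, are simplifications for illustration:

```js
const { WebSocketServer } = require('ws');

const wss = new WebSocketServer({ port: 8080 });
const clients = new Map();   // userId -> socket
const buffered = new Map();  // userId -> messages queued while offline

wss.on('connection', (ws, req) => {
  const userId = new URL(req.url, 'http://localhost').searchParams.get('user');
  clients.set(userId, ws);

  // Flush anything buffered while this user was offline.
  (buffered.get(userId) || []).forEach((m) => ws.send(m));
  buffered.delete(userId);

  ws.on('message', (raw) => {
    const { to, text } = JSON.parse(raw.toString());
    const payload = JSON.stringify({ from: userId, text });
    const peer = clients.get(to);
    if (peer) peer.send(payload);                                   // online: deliver now
    else buffered.set(to, [...(buffered.get(to) || []), payload]);  // offline: buffer
  });

  ws.on('close', () => clients.delete(userId));
});
```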

Is it possible to share 3rd party API websocket/realtime connection through GraphQL/REST backend?

My backend server connects to a third-party API service. This third-party API service sends events via a websocket to my backend, which then relays them back to my frontend. I pay for every message sent over the websocket.
In addition, the websocket provides different data depending on which value is requested. An example: imagine an API service that provides real-time payments for different products. If I want to receive real-time payments for books, I'd use the value "books"; if I wanted real-time payments for iPhones, I'd use the value "iPhones", etc.
To minimise costs, I'd prefer users connecting through my backend to share the stream of information rather than creating a new websocket connection every time, and I'd like connections that are no longer in use to be disconnected from the third-party websocket. I.e. if a user were to request "books" through my backend, a connection would be opened to the third-party API and the data streamed back; if another user were to also request "books", they'd just hop onto the existing open connection.
Is this possible with GraphQL, and how would you imagine it being implemented? Just off the top of my head, I'd imagine some sort of tracking on the backend that checks which connections are open and whether they're being used. Otherwise, is it possible in REST?
You could do something very similar with Apollo's polling features, with a little custom logic thrown in.
Basically, you could hold open one connection to the third party, use it like a socket to keep your own data layer updated, and then, when another user queries that data layer, the data should already be there (or you can add a quick check on top of that).
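Independent of GraphQL versus REST, the sharing the question describes boils down to reference-counting subscribers per topic: open the paid upstream connection for the first subscriber and close it when the last one leaves. A hedged sketch, where `openUpstream` is a placeholder for connecting to the third-party feed:

```js
const subscriptions = new Map(); // topic -> { socket, listeners }

function subscribe(topic, onData) {
  let sub = subscriptions.get(topic);
  if (!sub) {
    // First subscriber for this topic: open exactly one upstream connection.
    sub = { socket: openUpstream(topic), listeners: new Set() };
    sub.socket.on('message', (msg) => sub.listeners.forEach((fn) => fn(msg)));
    subscriptions.set(topic, sub);
  }
  sub.listeners.add(onData);

  return function unsubscribe() {
    sub.listeners.delete(onData);
    if (sub.listeners.size === 0) {
      sub.socket.close();          // last subscriber left: stop paying
      subscriptions.delete(topic);
    }
  };
}
```

A GraphQL subscription resolver or an SSE route can both sit on top of `subscribe`, calling the returned `unsubscribe` when the client disconnects.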

Sending messages between clients socket.io

I'm working on a chat application and using socket.io / node for that. Basically I came up with the following strategies:
Send the message from the client to the socket server, which then sends it to the receiving client. In the background, store the message in the DB so it can be retrieved later if the user wishes to see old conversations.
The pro of this approach is that the user gets the message almost instantly, since we don't wait for the DB operation to complete. The con is that if the DB operation fails, and exactly at that moment the client refreshes the page to fetch messages, it won't get that message.
Send the message from the client to the server; the server stores it in the DB first and only then sends it to the receiving client.
The pro is that the message reaches the client only if it has been stored in the DB. The con is that it is further from real time, since the DB operation in between slows down the message delivery.
Send the message to the server, which stores it in a cache layer (Redis, for example) and instantly broadcasts it to the receiving client. In the background, keep flushing records from Redis into the DB. If the client refreshes the page, we first look into the DB and then into the Redis layer.
The pro is that communication is fast and messages are still presented correctly on demand. The con is that this is quite complex compared to the implementations above, and I'm wondering if there's an easier way to achieve it.
My question is: what's the way to go if you're building a serious chat application that ensures both fast communication and data persistence? What strategies do apps like Facebook, WhatsApp, etc. use for this? I'm not looking for an exact example, but a few pointers would help.
Thanks.
I would go for option number 2. I've built chat apps in Node myself, and I found this to be the best option. Saving to a database takes only a few milliseconds, which includes the fraction of a millisecond for the database write itself plus a few milliseconds of communication latency (https://blog.serverdensity.com/mongodb-benchmarks/).
So I would consider this approach real-time. The good thing is that if it fails, you can display a message to the sender saying that it failed, for whatever reason.
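A minimal sketch of option 2 with socket.io, using the acknowledgement callback so the sender hears about failures; `saveMessage` and the `socketIdFor` lookup are placeholders:

```js
io.on('connection', (socket) => {
  socket.on('chat:send', async (msg, ack) => {
    try {
      const saved = await saveMessage(msg);          // DB write first
      io.to(socketIdFor(msg.to)).emit('chat:new', saved);
      ack({ ok: true, id: saved.id });               // confirm to the sender
    } catch (err) {
      ack({ ok: false, error: 'could not save message' });
    }
  });
});
```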
Facebook, WhatsApp and many other big messaging apps are based on XMPP (Jabber), a very, very big protocol for instant messaging, and how to do things with it is very well documented. But it is based on XML, so you still have to parse everything, etc.; luckily, there are very good libraries for dealing with XMPP. So if you want to go the common route you can use XMPP, but most of the big players in this area no longer follow all of its standards, since they don't cover all the features we are used to today.
I would go with building my own version; actually, I already have something built (similar to Slack). If you want, I could give you access to it in private.
So, to wrap up: number 2 is the way to go (for me). XMPP is cool but also brings a lot of complexity.

RESTful backend and socket.io to sync

Today I had the idea for the following setup: create a Node.js server with Express and socket.io. With Express, I would create a RESTful API connected to a Mongo database. Backbone.js or similar would connect the client to that REST API.
Now, every time the MongoDB (i.e. the data in it I am interested in) changes, socket.io would fire an event to the client carrying a cursor to the data that has changed. The client would then trigger the appropriate AJAX requests against the REST API to get the new data where it needs it.
So the socket.io connection would behave like a synchronize trigger. It would be there for the entire visit and could also manage sessions that way. All the payload would be sent over HTTP.
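In code, the trigger idea might look roughly like this (the Mongoose `Todo` model is an assumption): after a successful write, the server emits only a pointer to what changed, and clients re-fetch over REST:

```js
app.put('/todos/:id', async (req, res) => {
  const todo = await Todo.findByIdAndUpdate(req.params.id, req.body, { new: true });
  io.emit('changed', { collection: 'todos', id: todo.id });  // pointer only, no payload
  res.json(todo);
});

// Client side (illustrative):
// socket.on('changed', ({ collection, id }) =>
//   fetch(`/${collection}/${id}`).then((r) => r.json()).then(render));
```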
Pros:
REST API for use with other clients than web
Auth could be done entirely over socket.io, only sending a token along with REST requests.
Use the benefits of REST.
Would also play nicely with a pub/sub service like Redis.
Cons:
Greater overhead than using pure socket.io.
What do you think? Are there any big disadvantages I did not think of?
I agree with #CharlieKey, you should send the updated data rather than re-requesting.
This is exactly what Tower is doing:
save some data: https://github.com/viatropos/tower/blob/development/src/tower/model/persistence.coffee#L77
insert into mongodb (cursor is a query/persistence abstraction): https://github.com/viatropos/tower/blob/development/src/tower/model/cursor/persistence.coffee#L29
notify sockets: https://github.com/viatropos/tower/blob/development/src/tower/model/cursor/persistence.coffee#L68
emit updated records to client: https://github.com/viatropos/tower/blob/development/src/tower/server/net/connection.coffee#L62
The disadvantage of using sockets as a trigger to re-request with AJAX is that every connected client has to fetch the data, so if 100 people are on your site there will be 100 HTTP requests every time the data changes, whereas you could just reuse the socket connections.
I think pushing the updated data with the socket.io event would be better than re-requesting the latest state. Even better, you could push only the modified pieces of data, decreasing the amount of data sent over the wire. Overall, though, an interesting idea.
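Following that suggestion, the same hypothetical route from above could push the delta itself, so the 100 connected clients get the change without 100 follow-up HTTP requests:

```js
app.put('/todos/:id', async (req, res) => {
  const todo = await Todo.findByIdAndUpdate(req.params.id, req.body, { new: true });
  io.emit('todos:patched', { id: todo.id, changes: req.body });  // push just the delta
  res.json(todo);
});
```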
I'd look into Now.js, since it does pretty much exactly what you need.
It creates a namespace that is shared between the client and server. The server can call functions on the client directly, and vice versa.
That is, if you insist on your current infrastructure decision to use MongoDB and Node.js; otherwise, there is CouchDB, a full web server and document database with sophisticated replication mechanisms built in.
