Socket.io offline behaviour server-side - node.js

I have run into an unforeseen problem with my socket.io setup.
I use socket.io to live load data from my database (mongoDB, nodejs, react).
To accomplish this, I use mongoDB's changestream to detect changes and then push them to the front-end via socket.io.
Now this works perfectly as long as the user is connected. Right now, when the user reconnects, the front-end simply reloads all data. While this is fine for most users, a small group has a very bad network connection, so the front-end reloads data all the time, which makes it unresponsive for some time.
So I am looking for a way to send only the events that occurred while the front-end was offline. The front-end can do this quite easily: https://socket.io/docs/v4/client-offline-behavior/
But it doesn't seem possible on the server side, since socket.io (server side) immediately forgets sockets that have disconnected and thus can't buffer events for them.
So I was wondering if there is a good way to do this? Or would this need a full "wrapper" around socket.io that caches disconnected sockets?
Any help or advice would be appreciated!

I find this a really interesting and painful problem! ^^'
If you can give more details, it may help people give you a better answer.
For instance:
How much data is stored in the database, how much does a typical user receive, and how many events are triggered in a given time frame?
How quickly must an event become visible? I mean, if users receive an event with a 10s, 30s, ... delay, is that harmful to the service?
How is your data structured? Is it a simple JSON array with the same fields, custom fields, dynamic JSON objects, etc.?
How is your React app structured? Do you run heavy logic when your data is updated, etc.?
I think you should put more controls in your front-end code and update only when there is new data.
Some paths to explore
1. Put more controls in your front end
As you stated, for users with a bad connection, the React client seems to update its state too often, because it reloads the data again and again each time the websocket reconnects. The UI may freeze in this case, yes.
For this, I can think of two approaches:
Before updating the state, check whether the current React state is the same as the data you receive from the websocket connection. If the reconnection is quick enough and no new data has arrived, it should be the same, so in this case do not update the React state.
If too many events are triggered and new data arrives after each reconnection, you can buffer the data from the websocket and display it only once per time frame. By time frame, I mean you can use functions like setInterval or requestAnimationFrame to trigger the React update. Some pseudo React code to illustrate this:
import { useEffect, useRef, useState } from "react";

// `websocket` is assumed to be an already created socket.io client instance.
function App() {
  const [events, setEvents] = useState({ datas: [] });
  const bufferedEvents = useRef([]);
  useEffect(() => {
    // Buffer incoming events instead of updating the React state on every message.
    const bufferEvents = (newEvents) => {
      bufferedEvents.current = bufferedEvents.current.concat(newEvents);
    };
    websocket.on("connected", bufferEvents); // events received at (re)connection
    websocket.on("data", bufferEvents); // single live event
    // On every tick, move everything buffered since the last tick into the React state.
    const intervalId = setInterval(() => {
      const pending = bufferedEvents.current;
      bufferedEvents.current = [];
      // Update only if there is new data
      if (pending.length > 0) {
        setEvents((prevState) => ({ datas: prevState.datas.concat(pending) }));
      }
    }, 1000); // Update once per second. Adapt the delay as needed, or use requestAnimationFrame instead.
    // Do not forget to clear the interval and the listeners when the component unmounts
    return () => {
      clearInterval(intervalId);
      websocket.off("connected", bufferEvents);
      websocket.off("data", bufferEvents);
    };
  }, []);
  return (
    <div>
      <span>Total events: {events.datas.length}</span>
      <br />
      {events.datas.map((event, index) => (
        <div key={index}>{event.data}</div>
      ))}
    </div>
  );
}
You can look at this article for details on using requestAnimationFrame.
I think that modifying the front end is needed in any case, but on its own it is still not great for performance.
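The first approach above (skipping the state update when nothing new arrived) could be sketched as follows. This is only a minimal sketch: it assumes each event carries a unique id field, which the question does not confirm, and mergeNewEvents is a hypothetical helper name.

```javascript
// Hypothetical helper: returns the SAME array reference when nothing is new,
// so the caller can detect "no change" and skip the React state update.
// Assumption: every event has a unique `id` field.
function mergeNewEvents(current, incoming) {
  const knownIds = new Set(current.map((event) => event.id));
  const fresh = [];
  for (const event of incoming) {
    if (!knownIds.has(event.id)) {
      knownIds.add(event.id); // also guards against duplicates inside `incoming`
      fresh.push(event);
    }
  }
  return fresh.length === 0 ? current : current.concat(fresh);
}
```

Inside a state updater, this could be used as: setEvents((prev) => { const datas = mergeNewEvents(prev.datas, reloaded); return datas === prev.datas ? prev : { datas }; }). Returning the previous state object unchanged lets React skip the re-render.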
2. Fetch only new data in your back end
For this approach, it really depends on how your data is structured in the database.
If the data has a timestamp in it, I can think of a naive but simple approach: a cookie holding a timestamp.
When the user connects for the first time, this cookie is null.
When they fetch the data on the websocket connection, they receive all the data. When the data arrives, you update the cookie timestamp with the most recent date in the data.
When the websocket is disconnected, you open a new websocket with the cookie timestamp attached. With this information you can query only the data more recent than the timestamp in the cookie.
This way, you don't have to download the entirety of the data, only the fresh part.
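The timestamp idea above could look roughly like this. It is a sketch under assumptions: events are taken to have a numeric createdAt field (milliseconds), eventsSince and latestTimestamp are hypothetical helper names, and the in-memory array stands in for the real database query.

```javascript
// Assumption: each event has a numeric `createdAt` timestamp in milliseconds.

// Server side: return only the events newer than the timestamp sent by the client.
// A null/undefined timestamp means "first connection", so everything is returned.
function eventsSince(allEvents, since) {
  if (since == null) return allEvents.slice();
  return allEvents.filter((event) => event.createdAt > since);
}

// Client side: compute the value to store (cookie, localStorage, ...) after
// receiving a batch, keeping the previous value if the batch is empty.
function latestTimestamp(events, previous = null) {
  return events.reduce(
    (latest, event) =>
      latest == null || event.createdAt > latest ? event.createdAt : latest,
    previous
  );
}
```

With MongoDB, the filter in eventsSince would presumably become a query such as { createdAt: { $gt: since } } rather than an in-memory filter.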
Other approaches may be more helpful, but without more information on your data and more precise requirements, it is hard to say.
If you have a lot of data, I would personally look at some pagination mechanism, and maybe combine classic HTTP requests for fetching the data with websocket, SSE, or long polling for live events.
You can leave a comment if needed and I will update my answer!
Cheers

Related

Change Socket for another user

I'm trying to develop an API for online multiplayer using socket programming in Node.js.
I have some basic questions:
1. How do I know which connection belongs to which user?
2. How do I create a socket object related to another person?
3. When it's the opponent's turn, how do I raise an event?
4. There is a limited time for each move; how do I handle the timer so that an event is raised and the turn changes?
As is obvious, I don't know how to handle users, for example how to list online users.
If you can suggest some articles or answer these questions, that would be great.
Thanks
1. Keep some sort of data structure in memory where you save your sockets. You may want to wrap the node.js socket in your own object which contains an id property. Then you can save these objects in an in-memory data structure.
const crypto = require("crypto");

class User {
  constructor(socket) {
    this.socket = socket;
    this.id = crypto.randomUUID(); // any unique id works; a simple counter would also do
  }
}
Then save this object in memory when you get a new socket.
const net = require("net");

const sockets = {};
const server = net.createServer((socket) => {
  const user = new User(socket);
  sockets[user.id] = user;
});
2. I am unsure what you mean by that, but maybe the above point helps.
3. This depends on when you define that a new turn starts. Is the new turn triggered by something another user does? If so, use your solution to point 2 to relay that message to the related user and write something back to that socket.
4. Use a timeout. Maybe give your User class an additional timeout property; whenever you want to start a new timeout, do timeout = setTimeout(timeouthandler, howlong). If the timeouthandler is triggered, the user is out of time, so write to the socket. Don't forget to cancel your timeouts with clearTimeout when you need to.
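The timeout handling described above could be wrapped in a small helper. This is a sketch only; TurnTimer and its method names are hypothetical, not part of node.js.

```javascript
// Hypothetical helper: tracks the move deadline for one player.
class TurnTimer {
  constructor() {
    this.timeout = null;
  }

  // Start (or restart) the countdown; `onExpire` runs if the player does not move in time.
  start(howLongMs, onExpire) {
    this.cancel();
    this.timeout = setTimeout(() => {
      this.timeout = null;
      onExpire();
    }, howLongMs);
  }

  // Cancel the countdown, e.g. when the player moves in time or disconnects.
  cancel() {
    if (this.timeout !== null) {
      clearTimeout(this.timeout);
      this.timeout = null;
    }
  }

  isRunning() {
    return this.timeout !== null;
  }
}
```

A User object could hold one TurnTimer, call start() when its turn begins, and call cancel() when a move arrives; the onExpire callback is where the server would write the "out of time" message to the socket and hand the turn to the opponent.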
Also, as a side note, if you are doing this with pure node.js TCP sockets, you need to come up with some ad-hoc protocol. Here is why:
socket.on("data", (data) => {
  // this could be triggered multiple times for a single socket.write() due to the streaming nature of TCP
})
You could do something like
const crypto = require("crypto");

class User {
  constructor(socket) {
    this.socket = socket;
    this.id = crypto.randomUUID(); // any unique id works; a simple counter would also do
    socket.on("data", (data) => {
      // on each message you get, find out the type of message,
      // which could be anything you define. Is it a login?
      // End of turn?
      // Logout?
    });
  }
}
EDIT: This is not something that scales well; it is just to give you an idea of what can be done. Imagine you decide to run a single node.js server instance for hundreds of users: all of those users' socket instances would be stored in the server's memory.
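As an illustration of the ad-hoc protocol mentioned above, one common choice is to send one JSON message per line and reassemble the lines on the receiving side. The sketch below assumes that convention and assumes socket.setEncoding("utf8") was called so that chunks arrive as strings; MessageDecoder is a hypothetical name.

```javascript
// Hypothetical framing helper: one JSON message per line ("\n" as delimiter).
// Assumption: chunks are strings, e.g. after socket.setEncoding("utf8").
class MessageDecoder {
  constructor() {
    this.buffer = "";
  }

  // Feed a chunk from socket.on("data"); returns the messages completed by this chunk.
  push(chunk) {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    this.buffer = lines.pop(); // last piece is an incomplete message (or "")
    return lines
      .filter((line) => line.length > 0)
      .map((line) => JSON.parse(line));
  }
}
```

Each User could own one decoder and call decoder.push(data) inside its "data" handler, then switch on a type field of each returned message (login, end of turn, logout, ...). The sender would write JSON.stringify(message) + "\n".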

Returning multiple asynchronous responses

I'm currently looking to set up an endpoint that accepts a request, and returns the response data in increments as they load.
The application of this is that given one upload of data, I would like to calculate a number of different metrics for that data. As each metric gets calculated asynchronously, I want to return this metric's value to the front-end to render.
For testing, my controller looks as follows, trying to use res.write
uploadData = (req, res) => {
  res.write("test");
  setTimeout(() => {
    res.write("test 2");
    res.end();
  }, 3000);
}
However, I think the issue stems from my client-side which I'm writing in React-Redux, and calling that route through an Axios call. From my understanding, it's because the axios request closes once receiving the first response, and the connection doesn't stay open. Here is what my axios call looks like:
axios.post('/api', data)
  .then((response) => {
    console.log(response);
  })
  .catch((error) => {
    console.log(error);
  });
Is there an easy way to do this? I've also thought about streaming; however, my concern with streaming is that I would like each connection to be direct and unique per client, and open only for a short amount of time (i.e. only while the metrics are being calculated).
I should also mention that the resource being uploaded is a db, and I would like to avoid parsing it and opening a connection multiple times, as would happen with multiple endpoints.
Thanks in advance, and please let me know if I can provide any more context
One way to handle this while still using a traditional API would be to store the metrics in an object somewhere (a database or Redis, for example), then just long poll the resource.
For a real-world example, say you want to calculate the following metrics of foo: time completed, length of request, bar, foobar.
You could create an object in storage that looks like this:
{
id: 1,
lengthOfRequest: 123,
.....
}
Then you would create an endpoint in your API, like metrics/{id}, that returns the object. Just keep calling the route until everything completes.
There are some obvious drawbacks to this, of course, but once you have enough information to know how long the metrics take to complete on average, you can tweak the time between the calls to your API.
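On the client, the "keep calling the route" loop could be sketched like this. The names missingMetrics and pollMetrics are hypothetical, and fetchRecord is an injected function standing in for whatever HTTP call reads metrics/{id} (for example an axios GET), so the sketch does not assume a particular client library.

```javascript
// Hypothetical helper: which of the expected metrics are not in the stored record yet?
function missingMetrics(record, expectedKeys) {
  return expectedKeys.filter((key) => record[key] === undefined);
}

// Keep calling `fetchRecord` until every expected metric is present.
// `onUpdate` receives each intermediate record so the UI can render partial results.
async function pollMetrics(fetchRecord, expectedKeys, delayMs, onUpdate) {
  for (;;) {
    const record = await fetchRecord();
    onUpdate(record);
    if (missingMetrics(record, expectedKeys).length === 0) return record;
    // Wait before asking again; tune delayMs to the average computation time.
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
}
```

A call might look like pollMetrics(() => axios.get('/metrics/1').then((r) => r.data), ['lengthOfRequest', 'timeCompleted'], 2000, render), where the route, the metric names, and render are placeholders.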

How do I manage groups/rooms with node WebSockets?

TL;DR below.
I am currently developing a React/Redux SPA that is driven by real-time data. I've decided to use ws, instead of socket.io since socket.io feels a bit high level for what I'm doing, I'd rather manage sockets myself.
In saying that, I'm struggling to find a way to manage the separation of updates/messages per view/route. Since I'm using client-side routing, separating them per Express route won't really work...
Messages between the server and client via WebSockets are JSON with actions like GET_ITEMS, then a response of GET_ITEMS_SUCCESS with an array of 'items', and for errors: ..._ERROR etc. This is all fine, since it's just a 1 to 1 transaction. The problem arises when broadcasting (1 to all) to all relevant clients when the server receives an update.
So, I assume it is best practice to limit these broadcasts to the clients that are viewing/want the data. When viewing, for example, the Item page, there is no point in broadcasting updates to the User data, since that is only used on the User page.
I haven't been able to find any common practices when dealing with this sort of situation, just a few small outdated/barely used wrappers for ws that just add a few basic functions to leave/join but don't offer much flexibility with implementation.
What I think MIGHT work is to have an object/array for each 'group'/'room', which stores the clients that are currently listening to updates from a given section. So a user would send an action to INIT_LISTEN (& ``) with a param of category, e.g. ITEM for updates and other actions related to items.
TL;DR
What my question really boils down to is: How do I store a reference to a single socket? (ws client object? ws client ID?) Then, can I store this in an object/array to iterate through like below.
const ClientRooms = {
Items: {
{
...ws
}
/* ...rest of the client */
}
}
or
const ClientRooms = {
Items: [ "xyz" ] /* Array of ws ids */
}
I have a "ping-pong" heartbeat function to keep clients active and prevent silent connection failures/disconnections. I can't find out whether ws.terminate() still fires the ws close event, so that I can iterate over the 'group'/'room' object/array to find and remove instances of that client.
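One way the 'group'/'room' idea described above could be sketched: keep a Map from room name to a Set of client objects, so the socket object itself is the stored reference and no separate id is required. Rooms and its methods are hypothetical names; a client here is anything with a send(string) method, which is an assumption made so the sketch does not depend on ws.

```javascript
// Hypothetical room registry: room name -> Set of client objects.
// A client is anything with a send(string) method, such as a WebSocket.
class Rooms {
  constructor() {
    this.rooms = new Map();
  }

  join(room, client) {
    if (!this.rooms.has(room)) this.rooms.set(room, new Set());
    this.rooms.get(room).add(client);
  }

  leave(room, client) {
    const members = this.rooms.get(room);
    if (!members) return;
    members.delete(client);
    if (members.size === 0) this.rooms.delete(room); // drop empty rooms
  }

  // Remove a client from every room, e.g. from its "close" handler.
  leaveAll(client) {
    for (const room of [...this.rooms.keys()]) this.leave(room, client);
  }

  // Send a JSON message to every client currently in the room.
  broadcast(room, message) {
    const payload = JSON.stringify(message);
    for (const client of this.rooms.get(room) ?? []) client.send(payload);
  }
}
```

An INIT_LISTEN action with category ITEM would then map to rooms.join("ITEM", ws), and the close handler would call rooms.leaveAll(ws). Whether terminate() triggers that close handler is not settled here and should be checked against the ws documentation.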

How can I simulate latency in Socket.io?

Currently, I'm testing my Node.js, Socket.io server on localhost and on devices connected to my router.
For testing purposes, I would like to simulate a delay in sending messages, so I know what it'll be like for users around the world.
Is there any effective way of doing this?
If it's the messages you send from the server that you want to delay, you can override the .emit() method on each new connection with one that adds a short delay. Here's one way of doing that on the server:
io.on('connection', function(socket) {
  console.log("socket connected: ", socket.id);
  // override the .emit() method
  const emitFn = socket.emit
  socket.emit = (...args) => setTimeout(() => {
    emitFn.apply(socket, args)
  }, 1000)
  // rest of your connection handler here
});
Note, there is one caveat with this. If you pass an object or an array as the data for socket.emit(), this code does not make a copy of that data, so the data is not actually read until it is sent (1 second later). If the sending code modifies that data in the meantime, that would likely create a problem. This could be fixed by making a copy of the incoming data, but I did not add that complexity here, since whether it is needed depends on how the caller's code works.
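For completeness, the copying variant mentioned in the caveat might look like the sketch below. It assumes payloads are JSON-serialisable (binary data or Date objects would not survive the copy), and delayEmit with its schedule parameter is a hypothetical helper, not a Socket.io API.

```javascript
// Sketch: delay socket.emit() and snapshot object/array arguments at call time.
// Assumption: payloads are JSON-serialisable (no Buffers, Dates, etc.).
// `schedule` defaults to setTimeout and is a parameter only to ease testing.
function delayEmit(socket, delayMs, schedule = setTimeout) {
  const emitFn = socket.emit;
  socket.emit = (...args) => {
    // Copy objects and arrays now; strings, numbers and ack callbacks pass through as-is.
    const snapshot = args.map((arg) =>
      arg !== null && typeof arg === "object"
        ? JSON.parse(JSON.stringify(arg))
        : arg
    );
    schedule(() => emitFn.apply(socket, snapshot), delayMs);
    return true;
  };
  return socket;
}
```

Inside the connection handler this would be called as delayEmit(socket, 1000) in place of the manual override.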
An old but still popular question. :)
You can use either "iptables" or "tc" to simulate delays/dropped-packets. See the man page for "iptables" and look for 'statistic'. I suggest you make sure to specify the port or your ssh session will get affected.
Here are some good examples for "tc":
http://www.linuxfoundation.org/collaborate/workgroups/networking/netem

Resolve MongoDB reference

I am currently building a chatting app with nodejs and mongoDB.
Basically I have two collections to maintain in the db.
user = {
  _id: ObjectId("1234"),
  account: "stan123"
}
thread = {
  _user: ObjectId("1234"),
  messages: [
    {
      body: "hi",
      _user: ObjectId("1234")
    },
    {
      body: "second msg",
      _user: ObjectId("1234")
    }
  ]
}
I am planning to pass the thread model with all resolved info (user) to the client side, so that I can construct my widget with it.
I searched for solutions for this. Some suggest making extra calls from the client side to get the data.
However, I am worried that when the number of messages grows, there will be a considerable number of HTTP calls that might hurt site speed.
I know some drivers can resolve DBRefs automatically and keep the code clean.
However, according to
http://docs.mongodb.org/manual/applications/database-references/
I decided to just use the id to maintain the reference, to keep it as simple as possible.
My plan is to resolve all references on the server side. The current approach is to get the length of the message array first.
Then loop through the message array and make a second query per message to resolve the user info separately.
In each query callback, do a messageToResolve++ and check if(messageToResolve >= thread.messages.length).
If the condition is met, send the resolved model to the client and end the response.
This is not a case where I would consider embedding, because it would be painful when you need to update user data.
(message is embedded because it exists only when thread exists)
I am not sure if it's a good way to do it.
Does anyone have a better solution?
Sorry if I didn't explain my problem and solution clearly enough.
And thanks in advance.
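The server-side plan described above (resolve each message's user, then respond once every lookup has finished) could be sketched with promises instead of a manual counter. This is an assumption-laden sketch: findUserById is a hypothetical lookup function standing in for the real database query, resolveThread is a hypothetical name, and the thread shape follows the example in the question. Looking each distinct user id up only once is a small variation on the plan, not something the question specifies.

```javascript
// Resolve the user of every message in a thread, then return the completed model.
// `findUserById` is a hypothetical async lookup (e.g. wrapping a database query).
async function resolveThread(thread, findUserById) {
  // Look each distinct user id up once, however many messages reference it.
  const ids = [...new Set(thread.messages.map((message) => String(message._user)))];
  // Promise.all resolves only after every lookup has finished, replacing the counter.
  const users = await Promise.all(ids.map((id) => findUserById(id)));
  const usersById = new Map(ids.map((id, index) => [id, users[index]]));
  return {
    ...thread,
    messages: thread.messages.map((message) => ({
      ...message,
      user: usersById.get(String(message._user)),
    })),
  };
}
```

The route handler would await resolveThread(thread, findUserById) and send the result in a single response.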

Resources