Storing data in arrays in node.js - node.js

I'm experimenting node.js building a chat server app so i'm trying to maintain client connections in arrays, incrementing index for every new connection inserting the client object in an array.
I don't like this solution because i think it's too much light, i put holes in arrays for every disconnection and find long to iterate in arrays for finding a certain connection.
Which is the "node" way to handle multiple connections?

For the record, my understanding here is that you're trying to track clients to know who to broadcast to? I think this based on a previous question you asked that I saw.
I believe that this is similar to how socket.io handles rooms internally. It's been a while since I actually looked into it, but I believe this is how it's done. That being said, what I've done in the past is use rooms for each chat. Something along the lines to a concatenation of the usernames/userid's of the members. This way when you get data on a connection, you can easily broadcast it back to the same room, which will send it to all connections in the room except for the one it received it from. Perfect for a chat application. Socket.io will handle tracking which connections are in which room.

Related

Working with WebSockets and NodeJs clusters

I currently have a Node server running that works with MongoDB. It handles some HTTP requests, but it largely used WebSockets. Basically, the server connects multiple users to rooms with WebSockets.
My server currently has around 12k WebSockets open and it's almost crippling my single threaded server, and now I'm not sure how to convert it over.
The server holds HashMap variables for the connected users and rooms. When a user does an action, the server often references those HashMap variables. So, I'm not sure how to use clusters in this. I thought maybe creating a thread for every WebSocket message, but I'm not sure if this is the right approach, and it would not be able to access the HashMaps for the other users
Does anyone have any ideas on what to do?
Thank you.
You can look at the socket.io-redis adapter for architectural ideas or you can just decide to use socket.io and the Redis adapter.
They move the equivalent of your hashmap to a separate process redis in-memory database so all clustered processes can get access to it.
The socket.io-redis adapter also supports higher-level functions so that you can emit to every socket in a room with one call and the adapter finds where everyone in the room is connected, contacts that specific cluster server, and has it send the message to them.
I thought maybe creating a thread for every WebSocket message, but I'm not sure if this is the right approach, and it would not be able to access the HashMaps for the other users
Threads in node.js are not lightweight things (each has its own V8 instance) so you will not want a nodejs thread for every WebSocket connection. You could group a certain number of WebSocket connections on a web worker, but at that point, it is likely easier to use clustering because nodejs will handle the distribution across the clusters for you automatically whereas you'll have to do that yourself for your own web worker pool.

Node.js sockets.io need for synchronization?

I'm currently working with node.js, using the socket.io library, to implement a simple chat application. In this applicatio, for irrelevant reasons, I want to setup a system in which a client can ask for a piece of information to the server. The server then broadcasts this request to all other online sockets which will respond with the the answer if they have it. The server then finally returns (the first) response it receives to this original client socket that made the request.
Naturally, the client might receive multiple responses, while only one is needed. Therefor, as soon as one has been received, the others should be discarded. However, it feels like I should use some kind of synchronized datastructure/code to make sure this check for "If an answer has already been received" works as intended.
I've done some searching on this subject but I've seen several mentions of node.js using an event-driven model and not requiring any synchronized code/datastructures, as there are no multiple threads. Is this true? Would my scenario not require any kind of special attention to synchronization and would it just work? Or would I need to use some synchronization methods and if so, which ones?
Code example:
socket.on('new_response', async data => {
await processResponse(data)
});
Due to the fact I am working with encryption I have to make use of async/await, which further complicates things. The processResponse function does a check whether a response has been received already, if not, it processes it, else, it ignores it.
I would suggest something as simple as including a uniqueID in each broadcast to the clients asking if they have a piece of information. The clients then include that same uniqueID in any response they send.
With that, your server can receive answers from the clients and just keep track of which uniqueID values it has already received an answer for and then, if an answer has already been received for that uniqueID, then it would just ignore the later clients that respond.
The uniqueID is server-side generated so it can literally just be an increasing number. You can store the numbers used so far in a server-side Set object so you can quickly look up if you've already received a response for that uniqueID.
Then, the only thing left to do is to age these uniqueIDs away in the Set at some point so they don't accumulate forever. A simple way to do that would be just replace the Set object with a second one every 15 minutes or so, keeping one older generation around so you can check both of them.

NodeJS and socket.io messages dispatching best practices

We are in the process of writing a multiplayer game with NodeJS and websockets. When designing the architecture of the server we came up with the idea of objects handling most of their own communication.
First example on a connection we would then pass the socket to the client object who could then handle himself the communications. One of the advantage is not having to have a global handler that has to route the messages. This makes the responsibilities very well separated.
I can then also tell the client themselves to join a specific room with the socket.io functionality by sending them the room id.
The main disadvantage we can see is that there may be occurrence were we need to duplicate the sockets. ie: if a client disconnects from a room and wants to join another.
So the other option is going a more classical way and handling all the messages globally and then routing the with associated id's to the concerned objects.
Would you guys advise one solution over the other and why?

Sending messages between clients socket.io

I'm working on a chat application and using socket.io / node for that. Basically I came up with the following strategies:
Send message from the client which is received by the socket server which then sends it to the receiving client. On the background I store that to the message on the DB to be retrieved later if the user wishes to seee his old conversations.
The pros of this approach is that the user gets the message almost instantly since we don't wait for the DB operation to complete, but the con is that if the DB operation failed and exactly that time the client refreshed its page to fetch the message, it won't get that.
Send message form the client to the server, the server then stores it on the DB first and then only sends it to the receiving client.
The pros is that we make sure that the message will be received to the client only if its stored in the DB. The con is that it will be no way close to real time since we'll be doing a DB operation in between slowing down the message passing.
Send message to the client which then is stored on a cache layer(redis for example) and then instantly broadcast it to the receiving client. On background keep fetching records from redis and updating DB. If the client refreshes the page, we first look into the DB and then the redis layer.
The pros is that we make the communication faster and also make sure messages are presented correctly on demand. The con is that this is quite complex as compared to above implementations, and I'm wondering if there's any easier way to achieve this?
My question is whats the way to go if you're building a serious chat application that ensures both - faster communication and data persistence. What are some strategies that app like facebook, whatsapp etc. use for the same? I'm not looking for exact example, but a few pointers will help.
Thanks.
I would go for the option number 2. I've been doing myself Chat apps in node and I found out that this is the best option. Saving in a database takes few milliseconds, which includes the 0.x milliseconds to write in the databse and the few milliseconds of latency in communication ( https://blog.serverdensity.com/mongodb-benchmarks/ ).
SO I would consider this approach realtime. The good thing with this is that if it fails, you can display a message to the sender that it failed, for whatever reason.
Facebook, whatsapp and many other big messaging apps are based on XMPP (jabber) which is a very, very big protocol for instant messaging and everything is very well documented on how to do things but it is based in XML, so you still have to parse everything etc but luckily there are very good libraries to handle with xmpp. So if you want to go the common way, using XMPP you can, but most of the big players in this area are not following anymore all the standards, since does not have all the features we are used to use today.
I would go with doing my own version, actually, I already something made (similar to slack), if you want I could give you access to it in private.
So to end this, number 2 is the way to go (for me). XMPP is cool but brings also a lot of complexity.

Using Backbone.iobind (socket.io) with a cluster of node.js servers

I'm using Backbone.iobind to bind my client Backbone models over socket.io to the back-end server which in turn store it all to MongoDB.
I'm using socket.io so I can synchronize changes back to other clients Backbone models.
The problems starts when I try to run the same thing over a cluster of node.js servers.
Setting a session store was easy using connect-mongo which stores the session to MongoDB.
But now I can't inform all the clients on every change, since the clients are distributed between the different node.js servers.
The only solution I found is to set a pub/sub queue between the different node.js servers (e.g. mubsub), which seems like a very heavy weight solution that will trigger an event on all the servers for every change.
How did you reach the conclusion that pub/sub is a "very heavy weight solution"?
Sounds like you got it right up until that part :-)
Oh, and pub/sub is not a queue.
Let's examine that claim:
The nice thing about pub/sub is that you publish and subscribe to channels/topics.
So, using the classic chat server example, let's say you have a million users connected in total, but #myroom only has 50 users in it.
When a message is sent to #myroom, it's being published once. No duplication whatsoever.
In most use-cases you won't even need to store it on disk/RAM, so we're mostly looking at network/bandwidth here. And, I mean, you're probably throwing more data (probably over the wire?) to MongoDB already, so I assume that's not your bottleneck.
If you also use socket.io's rooms features (which is basically its own pub/sub mechanism), that means only 5 users will have that message emitted to them over the websocket.
And no, socket.io won't iterate over 1M clients to find out which of them are in room #myroom ;-)
So the message is published once, each subscriber (node.js instance) will get notified once, and only the relevant clients -- socket.io won't waste CPU cycles in order to find them as it keeps track of them when they join() or leave() a room -- will receive the message.
Doesn't that sound pretty efficient and light-weight?
Give Redis a shot.
It's really simple to set-up, runs entirely in memory, blazing-fast, replication is extremely simple, etc.
That's the way socket.io recommends passing events between nodes.
You can find more information/code here.
Additionally, if MongoDB can't handle the load at any point, you can use Redis as your session-store as well.
Hope this helps!

Resources