Get the total number of connected users in a room in Socket.io with multiple processes/cluster - node.js

I want to get the number of connected users in a room in socket.io, and I am running multiple socket.io servers.
For example, when user1 connects to server1 and joins room room1, either of the statements below returns the list of connected users on server1:
io.nsps['/'].adapter.rooms["room1"]
or
io.sockets.adapter.rooms["room1"]
I also tried many other solutions found on SO and Google.
But when I use the above statements on server2, they do not return anything.
How can I get the number of connected users across all servers for a specific room?
I am using the socket.io-redis module to communicate between multiple processes.

If you want to share sessions, rooms, etc. between servers, you probably need to use Redis.
Here is the documentation for using multiple nodes (which is what you have, as far as I understand) and for using Redis to pass events between nodes: http://socket.io/docs/using-multiple-nodes/#passing-events-between-nodes
And yes, io.nsps['/'].adapter.rooms["room1"] is the correct way to check the sockets in one room.
Oops, I just noticed that you said you are already using socket.io-redis. If it is configured properly it should work; at least it does for me.
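A likely reason server2 returns nothing is that adapter.rooms only lists the sockets connected to the local process. Below is a minimal sketch of the idea of asking every process for its local members and merging the answers. It uses plain in-memory objects as stand-ins for two socket.io servers; LocalServer and countRoomAcrossServers are hypothetical names, not part of socket.io. Depending on your versions, the adapter may already offer a cluster-wide query (for example io.of('/').adapter.clients(['room1'], callback) in socket.io-redis 5.x, or io.in('room1').fetchSockets() in Socket.IO 4.x), so check the documentation for the version you run.

```javascript
// Stand-in for one socket.io process: it only knows its own sockets.
class LocalServer {
  constructor(name) {
    this.name = name;
    this.rooms = new Map(); // room name -> Set of local socket ids
  }
  join(socketId, room) {
    if (!this.rooms.has(room)) this.rooms.set(room, new Set());
    this.rooms.get(room).add(socketId);
  }
  leave(socketId, room) {
    const members = this.rooms.get(room);
    if (!members) return;
    members.delete(socketId);
    if (members.size === 0) this.rooms.delete(room);
  }
  localMembers(room) {
    return [...(this.rooms.get(room) || [])];
  }
}

// Ask every process for its local members and merge the answers.
function countRoomAcrossServers(servers, room) {
  const all = new Set();
  for (const server of servers) {
    for (const id of server.localMembers(room)) all.add(id);
  }
  return all.size;
}

const server1 = new LocalServer("server1");
const server2 = new LocalServer("server2");
server1.join("socket-a", "room1");
server1.join("socket-b", "room1");
server2.join("socket-c", "room1");

console.log(server2.localMembers("room1").length); // local view of server2 only
console.log(countRoomAcrossServers([server1, server2], "room1")); // merged view
```

In a real deployment the "ask every process" step would go through Redis rather than a shared array of objects.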

Related

how to access socket session in all clusters

I am working on setting up socket.io in cluster mode using PM2.
I am using socket.io-redis package and it works fine in cluster mode.
But a problem arises when I want to access all connected sockets, because in cluster mode a process doesn't know about the socket connections held by other processes.
I thought socket.io-redis kept track of all connected sockets and their session info, but it doesn't.
Is there any way to access the socket connections that exist across all processes in socket.io/Node.js?
Socket.io-redis does keep track, in a sense.
From their docs
"The Redis adapter extends the broadcast function of the in-memory adapter: the packet is also published to a Redis channel (see below for the format of the channel name).
Each Socket.IO server receives this packet and broadcasts it to its own list of connected sockets."
So basically, Redis is used as the broker that tells each socket server what to emit on which channel. This lets a socket.io server work in cluster mode, but as you mentioned, it falls short when you need to keep track of anything beyond an emit.
So where does this leave us? You can use custom hooks via socket.io-redis, but I found them difficult to understand and use, and had limited success. I think newer versions of socket.io and socket.io-redis made this simpler, but I have not tried them.
Instead, we use the Redis hset and hgetall commands to store each socket along with the user's ID; when we want all online users, or the users in a specific room, we query Redis for the list.
To do this, add the redis package and open a connection in addition to the regular pub/sub connections.
Then, when a user joins a room (or your server, for that matter), do an hset. On the first join, ours looks something like this:
redis.hset([collection-name],[Field],[value])
So in code it looks like
redis.hset(decoded.cID,"socket-" + socket.id,socket.nickname)
This sets a value in Redis: the collection name is the key (for us, the unique ID of the channel), the field is built from socket.id, and the value is a nickname, which is the user's ID, or "anonymous" if they are not logged in.
Then, when we want to know who is in a room, we use the hgetall command:
redis.HGETALL([collection-name],function(err,results){})
So inside, say, the emit handler, we call redis.HGETALL to get all items in the collection we pass in, and send the result back to all connected users.
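A minimal sketch of the bookkeeping described above, under a few assumptions: FakeRedisHashes is a hypothetical synchronous in-memory stand-in for the Redis hash commands HSET, HDEL and HGETALL (a real client is asynchronous), and the hdel on disconnect is a cleanup step that the answer above does not show. The helper names onJoin, onDisconnect and usersInRoom are also made up for illustration.

```javascript
// In-memory stand-in for the few Redis hash commands used here.
// A real Redis client is asynchronous; this one is synchronous for brevity.
class FakeRedisHashes {
  constructor() {
    this.hashes = new Map(); // key -> Map(field -> value)
  }
  hset(key, field, value) {
    if (!this.hashes.has(key)) this.hashes.set(key, new Map());
    this.hashes.get(key).set(field, String(value));
  }
  hdel(key, field) {
    const hash = this.hashes.get(key);
    if (!hash) return;
    hash.delete(field);
    if (hash.size === 0) this.hashes.delete(key);
  }
  hgetall(key) {
    return Object.fromEntries(this.hashes.get(key) || []);
  }
}

const redis = new FakeRedisHashes();

// On join: one hash per channel, one field per socket, nickname as value.
function onJoin(channelId, socket) {
  redis.hset(channelId, "socket-" + socket.id, socket.nickname);
}

// On disconnect: remove the field so the list does not go stale (assumed step).
function onDisconnect(channelId, socket) {
  redis.hdel(channelId, "socket-" + socket.id);
}

// Everyone currently recorded for the channel, whichever process they hit.
function usersInRoom(channelId) {
  return Object.values(redis.hgetall(channelId));
}

onJoin("channel-42", { id: "abc", nickname: "alice" });
onJoin("channel-42", { id: "def", nickname: "anonymous" });
onDisconnect("channel-42", { id: "abc", nickname: "alice" });
console.log(usersInRoom("channel-42"));
```

Because every process writes to the same hash, any process can answer "who is in this room" without knowing which process holds each connection.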

Working with WebSockets and NodeJs clusters

I currently have a Node server running that works with MongoDB. It handles some HTTP requests, but it largely uses WebSockets. Basically, the server connects multiple users to rooms with WebSockets.
My server currently has around 12k WebSockets open and it's almost crippling my single-threaded server, and now I'm not sure how to convert it over.
The server holds HashMap variables for the connected users and rooms. When a user does an action, the server often references those HashMap variables. So, I'm not sure how to use clusters in this. I thought maybe creating a thread for every WebSocket message, but I'm not sure if this is the right approach, and it would not be able to access the HashMaps for the other users.
Does anyone have any ideas on what to do?
Thank you.
You can look at the socket.io-redis adapter for architectural ideas or you can just decide to use socket.io and the Redis adapter.
They move the equivalent of your hashmap into a separate process (the Redis in-memory database) so that all clustered processes can access it.
The socket.io-redis adapter also supports higher-level functions, so you can emit to every socket in a room with one call; the adapter relays the message to the other cluster servers, and each server sends it to the room members connected to it.
"I thought maybe creating a thread for every WebSocket message, but I'm not sure if this is the right approach, and it would not be able to access the HashMaps for the other users"
Threads in node.js are not lightweight things (each has its own V8 instance), so you will not want a nodejs thread for every WebSocket connection. You could group a certain number of WebSocket connections on a worker thread, but at that point it is likely easier to use clustering, because nodejs will handle the distribution across the cluster workers for you automatically, whereas you would have to do that yourself for your own worker pool.

Uniqueness of socket.id of websocket in distributed web applications

Suppose you want to scale a web application that relies on websocket technology (e.g. the Socket.io library). You have multiple servers and you are using a shared database (e.g. Redis) to communicate between these servers.
In this case you store the socket id of each socket connection in that db. My question is:
Is it possible that two users connected to two different servers get the same socket.id, so that you couldn't differentiate them?
In that case, if you wanted to notify a specific user, you would notify another user too!
How is this possible?
If it is possible, how do people solve this problem in real-world use cases?
Is there any trick in the programming or in the design?
EDIT
I want to emphasize distributed environments.

Node.js tcp socket server on multiple machines

I have a node.js tcp server that is used as a backend to an iPhone chat client. Since my implementation includes private group chats, I store a list of users and the chat room they belong to in memory in order to route messages appropriately. This all works fine assuming my chat server will always be on one machine, but when/if I need to scale horizontally I need a good way of broadcasting messages to clients that connect to different servers. I don't want to start doing inter-process communication between node servers and would prefer sharing state with redis.
I have a few ideas, but I'm wondering if anyone has a good solution for this. To be clear, here is an example:
User 1 connects to server 1 on room X, user 2 connects to server 2 on room X. User 1 sends a message, and I need this to be passed to user 2, but since I am using an in-memory data structure the servers don't share state. I want my node servers to remain as dumb as possible so I can just add/remove them according to the needs of my system.
Thanks :)
You could use a messaging layer (using something like pub/sub) that spans the processes:
                                 Message Queue
-------------------------------------------------------------------------------
               |                                        |
            ServerA                                  ServerB
            -------                                  -------
    Room 1: User1, User2                     Room 1: User3, User5
    Room 2: User4, User7, User11             Room 2: User6, User8
    Room 3: User9, User13                    Room 3: User10, User12, User14
Let's say User1 sends a chat message. ServerA sends a message on the message queue that says "User1 in Room 1 said something" (along with whatever they said). Each of your other server processes listens for such events, so, in this example, ServerB will see that it needs to distribute the message from User1 to all users in its own Room 1. You can scale to many processes in this way--each new process just needs to make sure they listen to appropriate messages on the queue.
Redis has pub/sub functionality that you may be able to use for this if you're already using Redis. Additionally, there are other third-party tools for this kind of thing, like ZeroMQ; see also this question.
Redis is supposed to have built-in cluster support in the near future; in the meantime you can use a consistent hashing algorithm to distribute your keys evenly across multiple servers. Someone out there has a hashing module for node.js, which was written specifically to implement consistent hashing for a redis cluster module for node.js. You might want to key off the 'room' name to ensure that all data points for a room wind up on the same host. With this type of setup all the logic for which server to use remains on the client, so your redis cluster can basically remain the same and you can easily add or remove hosts.
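The message flow in the diagram can be sketched roughly as follows. MessageBus is a hypothetical in-process stand-in for the message queue (Redis pub/sub, ZeroMQ or similar), and ChatServer and its delivered log are made-up names for illustration. In this sketch every server, including the one that published, receives the message from the bus and delivers it to its own room members.

```javascript
// Minimal in-process stand-in for a pub/sub channel (e.g. Redis pub/sub).
class MessageBus {
  constructor() {
    this.subscribers = [];
  }
  subscribe(handler) {
    this.subscribers.push(handler);
  }
  publish(message) {
    for (const handler of this.subscribers) handler(message);
  }
}

class ChatServer {
  constructor(name, bus) {
    this.name = name;
    this.bus = bus;
    this.rooms = new Map(); // room -> Set of users connected to this server
    this.delivered = []; // records what this server sent to its own users
    bus.subscribe((message) => this.deliverLocally(message));
  }
  join(user, room) {
    if (!this.rooms.has(room)) this.rooms.set(room, new Set());
    this.rooms.get(room).add(user);
  }
  // A local user said something: publish it and let every server
  // (including this one) deliver to its own members of the room.
  say(user, room, text) {
    this.bus.publish({ from: user, room, text });
  }
  deliverLocally({ from, room, text }) {
    for (const user of this.rooms.get(room) || []) {
      if (user !== from) this.delivered.push({ to: user, from, text });
    }
  }
}

const bus = new MessageBus();
const serverA = new ChatServer("ServerA", bus);
const serverB = new ChatServer("ServerB", bus);
serverA.join("User1", "Room 1");
serverA.join("User2", "Room 1");
serverB.join("User3", "Room 1");
serverB.join("User6", "Room 2");

serverA.say("User1", "Room 1", "hello");
console.log(serverA.delivered.map((d) => d.to));
console.log(serverB.delivered.map((d) => d.to));
```

Adding a third server only requires subscribing it to the same bus; the existing servers do not change.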
Update
I found the consistent hashing implementation for redis I was talking about. It gives the code, of course, and also explains sharding in an easy-to-digest way.
http://ngchi.wordpress.com/2010/08/23/towards-auto-sharding-in-your-node-js-app/
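To illustrate keying off the room name, here is a minimal consistent-hashing sketch. It is not the implementation from the linked post: HashRing, the FNV-1a hash, the number of virtual points per host and the host names redis-1 to redis-3 are all assumptions chosen for the example.

```javascript
// 32-bit FNV-1a hash; any stable hash function would do here.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

class HashRing {
  constructor(hosts, replicas = 64) {
    this.replicas = replicas; // virtual points per host, for a more even spread
    this.points = []; // sorted list of { hash, host }
    for (const host of hosts) this.add(host);
  }
  add(host) {
    for (let i = 0; i < this.replicas; i++) {
      this.points.push({ hash: fnv1a(host + "#" + i), host });
    }
    this.points.sort((a, b) => a.hash - b.hash);
  }
  remove(host) {
    this.points = this.points.filter((point) => point.host !== host);
  }
  // First point at or after the key's hash, wrapping around at the end.
  hostFor(key) {
    if (this.points.length === 0) throw new Error("ring is empty");
    const keyHash = fnv1a(key);
    const point = this.points.find((p) => p.hash >= keyHash) || this.points[0];
    return point.host;
  }
}

const ring = new HashRing(["redis-1", "redis-2", "redis-3"]);
// All data for one room is keyed by the room name, so it lands on one host.
console.log(ring.hostFor("room:X"));
```

When a host is removed, only the rooms that lived on that host move; rooms on the remaining hosts keep their assignment, which is what makes adding or removing hosts cheap.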

How to get Node.js processes communicate with one another

I have a nodejs chat app where multiple clients connect to a common chat room using socketio. I want to scale this to multiple node processes, possibly on different machines. However, clients that connect to the same room are not guaranteed to hit the same node process. For example, user 1 will hit node process A and user 2 will hit node process B. They are in the same room, so if user 1 sends a message, user 2 should get it. What's the best way to make this happen, since their connections are managed by different processes?
I thought about just having the node processes connect to redis. This at least solves the problem that process A will know there's another user, user 2, in the room but it still can't send to user 2 because process B controls that connection. Is there a way to register a "value changed" callback for redis?
I'm in a server environment where I can't control any of the routing or load balancing.
Both node.js processes can be subscribed to some channel through redis pub/sub and listen to messages which you pass to this channel. For example, when user 1 connects to process A on the first machine, you can store in redis information about this user along with the information which process on which machine manages it. Then when user 2, which is connected to process B on the second machine, sends a message to user 1, you can publish it to this channel and check which process on which machine is responsible for managing communication with user 1 and respond accordingly.
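A rough sketch of that scheme, with in-memory stand-ins for Redis: directory plays the role of the stored "which process manages which user" information, and publish/subscribers stand in for a single pub/sub channel. NodeProcess, inbox and the other names are hypothetical.

```javascript
// Stand-ins for Redis: a shared lookup table and one pub/sub channel.
const directory = new Map(); // userId -> id of the process that owns the connection
const subscribers = []; // handlers subscribed to the shared channel

function publish(message) {
  for (const handler of subscribers) handler(message);
}

class NodeProcess {
  constructor(id) {
    this.id = id;
    this.inbox = new Map(); // userId -> messages pushed over the local connection
    subscribers.push((message) => this.onMessage(message));
  }
  // A user connected to this process: remember who manages the connection.
  connect(userId) {
    this.inbox.set(userId, []);
    directory.set(userId, this.id);
  }
  // Every process sees every message; only the owner of the recipient acts.
  onMessage({ to, from, text }) {
    if (directory.get(to) !== this.id) return;
    this.inbox.get(to).push({ from, text });
  }
  send(from, to, text) {
    publish({ to, from, text });
  }
}

const processA = new NodeProcess("A");
const processB = new NodeProcess("B");
processA.connect("user1");
processB.connect("user2");

processB.send("user2", "user1", "hi user1");
console.log(processA.inbox.get("user1"));
```

A variation is to give each process its own channel and publish only to the owner's channel, which avoids every process handling every message.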
I did some research on this. Here are my findings:
Like yojimbo87 said, first just use redis pub/sub (it is very optimized).
http://comments.gmane.org/gmane.comp.lang.javascript.nodejs/22348
Tim Caswell wrote:
It's been my experience that the bottleneck is the serialization and
de-serialization of the data, not the actual channel. I'm pretty sure
you can use named pipes, but I'm not sure what the API is. msgpack
seems like a good format for the data interchange. There are a few
libraries out there that implement msgpack or ipc frameworks on top of
it.
But when serialization/deserialization becomes your bottleneck, I would try https://github.com/pgriess/node-msgpack. I would also like to test this out, because I think the sooner you have this in place, the better.

Resources