Socket.io with Hazelcast - node.js

I have a Node.js application that uses cluster, and I use Hazelcast for scalability. I need to add Socket.io for real-time messaging, but when I connect to the socket on the worker processes, my client's socket object is lost.
I researched for a while but didn't find any solution. There are plenty of examples with Redis, but I cannot use Redis; I must use Hazelcast.
How can I share the sockets across every process, and also across every server, with Hazelcast? Is there a way?
Node version: v10.16.3
Socket.io version: 2.3.0
Thanks in advance.

Hazelcast provides both a pub/sub mechanism and in-memory caching on a JVM-based Hazelcast cluster, and you can connect to the cluster via the Hazelcast Node.js client. So it gives you the ability to manage and scale a Socket.io application via pub/sub topics while storing socket ids in a Hazelcast IMap data structure.
Here is an example in pseudo-code:
/* Assumed multiple Socket.io server instances are running and
** listening to the Hazelcast pub/sub topics, so every Socket.io
** server (and any other application) can communicate via those topics.
** `client` is an already-connected Hazelcast Node.js client instance.
*/
const topic = await client.getTopic('one-of-socket-rooms');
const map = await client.getMap('user-socket-ids');

io.on('connection', async (socket) => {
    console.log('a user connected with socketId: ' + socket.id);
    // assumed you already have the userId for this connection
    await map.put(userId, socket.id);
    socket.join('one-of-socket-rooms');
    topic.publish('User with ' + socket.id + ' joined the room');
});
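For completeness, here is a minimal sketch of the receiving side, reusing the same topic and io objects as above. The listener shape (a message object carrying the payload in messageObject) is an assumption based on the Hazelcast Node.js client, so verify it against the client version you use:
// assumed: `topic` and `io` are the same objects as in the snippet above
topic.addMessageListener((message) => {
    // relay the published payload to the clients connected to this Socket.io instance
    io.to('one-of-socket-rooms').emit('room-message', message.messageObject);
});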
If you can share a specific code piece, I can edit my answer according to your exact requirements.

You can't share sockets across processes. What you can do is distribute messages across processes using Redis or some other system. If you use the long-polling fallback of Socket.io, you will need to ensure sticky sessions are enabled; otherwise you need to ensure you are only using WebSockets. Socket.io provides an out-of-the-box solution for integrating Redis into your system to distribute messages across processes if your use case is simple.
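If you go the WebSocket-only route, a minimal sketch of the client-side option (the URL is just a placeholder) would be:
// socket.io-client 2.x: skip the long-polling fallback entirely
const socket = io('https://example.com', { transports: ['websocket'] });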

Related

What is the best way to run multiple services that use Socket.io

I am developing a website where I will use microservices.
I will have a couple or more Node.js applications that will use Socket.io.
I was trying to figure out how I will architecture this.
Can I use multiple Node.js apps with Socket.io connecting to a user, or will I run into conflicts? I can use NGINX as a proxy and a UUID to identify which microservice to send the request to. Does that make sense? Is there a better way?
Or I was also thinking of using a Node.js app as a proxy that receives all the Socket.io connections and then creates a connection with the user. But this seems to add to the network load because I am adding another microservice.
Anyways, I would love your views on this.
You can set up multiple Node.js services, all of them using the socket.io-redis lib to connect to a Redis server and share the socket IDs of all Node.js services. That means you store all socket info on the Redis server. When you emit an event, it will automatically be emitted across all Node.js services.
const server = require('http').createServer();
const io = require('socket.io')(server);
const redis = require('socket.io-redis');

// every Node.js service points its adapter at the same Redis server
io.adapter(redis({ host: 'localhost', port: 6379 }));

io.on('connection', socket => {
    // put the user's socket into a room named after user_id (assumed to be known here)
    socket.join(user_id);
});

// emit an event to a specific user; every Node.js service holding that user's connection will emit it
io.to(user_id).emit('hello', { key: 'ok' });
Depending on how your project goes, it might or might not be a good idea to build microservice communication over Socket.io.
The first thing that comes to mind is the weak guarantees of the Socket.io messaging system, especially the fact that your messages are not stored on disk. While this might be convenient, it is also a real problem when it comes to building audit trails, debugging, or replayability.
Another issue I see is scalability/cluster-ability: yes, you can have multiple services communicating over one Node.js broker, but what happens when you have to add more? How will you pass messages from one broker to the other? This will lead you to building an entire messaging system, and you will find it safer to use already existing solutions (RabbitMQ, Kafka...).

Can someone provide an example situation where I would need to use socket.io-redis?

From the socket.io docs [http://socket.io/docs/rooms-and-namespaces/#sending-messages-from-the-outside-world] I read the following but I can't seem to connect it to any use case in my head:
Sending messages from the outside-world
In some cases, you might want to emit events to sockets in Socket.IO namespaces / rooms from outside the context of your Socket.IO processes.
There’s several ways to tackle this problem, like implementing your own channel to send messages into the process.
To facilitate this use case, we created two modules:
socket.io-redis
socket.io-emitter
By implementing the Redis Adapter:
var io = require('socket.io')(3000);
var redis = require('socket.io-redis');
io.adapter(redis({ host: 'localhost', port: 6379 }));
you can then emit messages from any other process to any channel
var io = require('socket.io-emitter')();
setInterval(function(){
    io.emit('time', new Date);
}, 5000);
If you have a cluster of servers and want to talk to clients that are connected to different instances, you'll need a common storage -- that's when you use Redis.
You are also mentioning io-emitter, which is a way for other processes to post messages to your clients. For example, if a worker needs to emit messages to your clients, it can use io-emitter. Redis is the common glue for sharing messages across different processes/servers.
The module is needed only when you want to spread your solution across multiple servers or Node processes. Through the Redis adapter, the multiple servers can broadcast to each other's clients.
Basically, imagine you have two servers, each running its own Socket.io instance. Server A has three clients; Server B has two different clients. These two servers do not share any client information, so you won't be able to broadcast a message to all the users. The adapter gives you the ability to connect these different servers into one (using Redis), so you are able to broadcast to all the users.
There is also a good presentation about Socket.io and Redis: http://www.slideshare.net/YorkTsai/jsdc2013-28389880.

Sharing object between nodejs instances (high i/o)

I'm building a Node.js/Socket.io based game and I'm trying to implement Node clustering to take advantage of multicore machines (a few machines, each with a few cores). I figured that memcache would be a nice solution, but I'm not completely sure it will survive high load, because each game will do about 50 writes/reads per second. Also, what would be the best solution to broadcast a message to all clients while they're connected to different servers? For example, player X is connected to node 1 and does a simple action; how can I broadcast that action to player Y, who is connected to node 2?
If you are going to be clustering across threads, then using Redis as your Socket.IO store is a perfect solution.
Redis is an extremely fast database, entirely run from memory, that can support thousands of publish/subscribe messages in a single second. Socket.IO has built-in support for Redis and when using it, any message emitted from one instance of Socket.IO is published to Redis, and all Socket.IO instances that have subscribed to the same Redis instance will also emit the same message.
This is how you would set up each Socket.IO instance:
var RedisStore = require('socket.io/lib/stores/redis');
var redis = require('socket.io/node_modules/redis');

io.set('store', new RedisStore({
    redisPub: redis.createClient(),
    redisSub: redis.createClient(),
    redisClient: redis.createClient()
}));
When Socket.IO is set up like this and you have a Redis server, a simple io.sockets.emit() will broadcast to all clients on any server, regardless of which server executed the code, as long as all servers are publishing/subscribing to the same Redis instance.
You can actually serialize the object into a JSON string, pass it as an argument, and deserialize it in each worker (one per CPU).
Otherwise you can use cluster.workers[<id>].send(<object>), which will automatically take care of serializing/deserializing.
For more information, check the Node API docs.
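A minimal sketch of that second approach (the message shape and payload are just illustrative):
const cluster = require('cluster');

if (cluster.isMaster) {
    const worker = cluster.fork();
    // the cluster module serializes/deserializes the object for us
    worker.send({ type: 'playerAction', payload: { playerId: 'X', action: 'move' } });
} else {
    process.on('message', (msg) => {
        console.log('worker ' + cluster.worker.id + ' received', msg);
    });
}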

How does socket.io send messages across multiple servers?

The Socket.io API has the ability to send messages to all clients.
With one server and all sockets in memory, I understand how that one server can send a message to all its clients; that's pretty obvious. But what about with multiple servers using Redis to store the sockets?
If I have client a connected to server y and client b connected to server z (and a Redis box for the store) and I do socket.broadcast.emit on one server, the client on the other server will receive this message. How?
How do the clients that are actually connected to the other server get that message?
Is one server telling the other server to send a message to its connected client?
Is the server establishing its own connection to the client to send that message?
Socket.io uses MemoryStore by default, so all the connected clients are stored in memory, making it impossible (well, not quite, but more on that later) to send and receive events from clients connected to a different Socket.io server.
One way to make all the Socket.io servers receive all the events is to have every server use Redis's pub/sub. So, instead of using socket.emit, one can publish to Redis.
var redis_client = require('redis').createClient();
redis_client.publish('channelName', data); // `data` is whatever event payload you want to fan out
And all the socket servers subscribe to that channel through Redis and, upon receiving a message, emit it to the clients connected to them.
var redis_sub = require('redis').createClient();
redis_sub.subscribe('channelName', 'moreChannels');
redis_sub.on("message", function (channel, message) {
    // re-emit to the clients connected to this particular server
    io.sockets.emit(channel, message);
});
Complicated stuff!! But wait, it turns out you don't actually need this sort of code to achieve the goal. Socket.io has RedisStore, which essentially does what the code above is supposed to do in a nicer way, so that you can write Socket.io code as you would for a single server and it will still be propagated to the other Socket.io servers through Redis.
To summarise, socket.io sends messages across multiple servers by using Redis as the channel instead of memory.
There are a few ways you can do this. More info in this question. A good explanation of how pub/sub in Redis works is here, in Redis' docs. An explanation of how the paradigm works in general is here, on Wikipedia.
Quoting the Redis docs:
SUBSCRIBE, UNSUBSCRIBE and PUBLISH implement the Publish/Subscribe
messaging paradigm where (citing Wikipedia) senders (publishers) are
not programmed to send their messages to specific receivers
(subscribers). Rather, published messages are characterized into
channels, without knowledge of what (if any) subscribers there may be.
Subscribers express interest in one or more channels, and only receive
messages that are of interest, without knowledge of what (if any)
publishers there are. This decoupling of publishers and subscribers
can allow for greater scalability and a more dynamic network topology.

Examples in using RedisStore in socket.io

I am trying to scale a simple socket.io app across multiple processes and/or servers.
Socket.io supports RedisStore but I'm confused as to how to use it.
I'm looking at this example,
http://www.ranu.com.ar/post/50418940422/redisstore-and-rooms-with-socket-io
but I don't understand how using RedisStore in that code would be any different from using MemoryStore. Can someone explain it to me?
Also, what is the difference between configuring socket.io to use RedisStore vs. creating your own Redis client and setting/getting your own data?
I'm new to node.js, socket.io and redis so please point out if I missed something obvious.
but I don't understand how using RedisStore in that code would be any different from using MemoryStore. Can someone explain it to me?
The difference is that when using the default MemoryStore, any message that you emit in a worker will only be sent to clients connected to the same worker, since there is no IPC between the workers. Using the RedisStore, your message will be published to a redis server, which all your workers are subscribing to. Thus, the message will be picked up and broadcast by all workers, and all connected clients.
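To make that concrete, here is a minimal sketch under the socket.io 0.9-era RedisStore API used elsewhere in this thread (port and worker count are illustrative): every cluster worker points its store at the same Redis instance, so an emit in one worker reaches clients connected to all workers.
var cluster = require('cluster');

if (cluster.isMaster) {
    for (var i = 0; i < require('os').cpus().length; i++) {
        cluster.fork();
    }
} else {
    var io = require('socket.io').listen(3000); // the cluster module shares the listening handle
    var RedisStore = require('socket.io/lib/stores/redis');
    var redis = require('socket.io/node_modules/redis');

    io.set('store', new RedisStore({
        redisPub: redis.createClient(),
        redisSub: redis.createClient(),
        redisClient: redis.createClient()
    }));

    io.sockets.on('connection', function (socket) {
        // published to Redis, so every worker re-emits it to its own clients
        io.sockets.emit('news', 'hello from worker ' + cluster.worker.id);
    });
}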
Also, what is the difference between configuring socket.io to use RedisStore vs. creating your own Redis client and setting/getting your own data?
I'm not intimately familiar with RedisStore, so I'm not sure about all the differences. But doing it yourself would be a perfectly valid approach. In that case, you could publish all messages to a Redis server and listen for them in your socket handler. It would probably be more work for you, but you would also have more control over how you want to set it up. I've done something similar myself:
var redis = require('redis');

// Publishing a message somewhere
var pub = redis.createClient();
pub.publish("messages", JSON.stringify({type: "foo", content: "bar"}));

// Socket handler
io.sockets.on("connection", function(socket) {
    var sub = redis.createClient();
    sub.subscribe("messages");
    sub.on("message", function(channel, message) {
        socket.send(message);
    });
    socket.on("disconnect", function() {
        sub.unsubscribe("messages");
        sub.quit();
    });
});
This also means you have to take care of more advanced message routing yourself, for instance by publishing/subscribing to different channels. With RedisStore, you get that functionality for free by using socket.io channels (io.sockets.of("channel").emit(...)).
A potentially big drawback with this is that socket.io sessions are not shared between workers. This will probably mean problems if you use any of the long-polling transports.
I set up a small github project to use redis as datastore.
Now you can run multiple socket.io server processes.
https://github.com/markap/socket.io-scale
Also, what is the difference between configuring socket.io to use RedisStore vs. creating your own Redis client and setting/getting your own data?
The difference is that, when you use RedisStore, socket.io itself saves the socket heartbeat and session info into Redis, so the client keeps working when you run Node.js as a cluster.
Without Redis, the client might hit a different Node.js process next time, so the session will be lost.
The difference is that, if you have a cluster of Node.js instances running, MemoryStore won't work, since it's only visible to a single process.
