I'm curious whether I can use socket.io together with cluster.
I know that cluster lets Node.js use multiple cores by running multiple workers.
Does that mean that if I use cluster with socket.io, two users connected to
two different socket.io workers might not be able to communicate with each other?
So would the answer be to not use cluster with socket.io at all?
Check out dshaw's talk and sample app on scaling Socket.IO: https://github.com/dshaw/talks/tree/master/2011-10-jsclub/sample-app
This Stack Overflow question might also help:
How to reuse redis connection in socket.io?
Basically, use Redis pub/sub, with one or more channels over which the workers exchange messages.
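To make the idea concrete, here is a minimal sketch of why a shared channel lets users on different workers reach each other. It is not real Redis code: an in-process EventEmitter stands in for the Redis server, and createInstance, the channel name and the user ids are all made up for illustration.

```javascript
const { EventEmitter } = require('events');

// Assumption: this in-process emitter stands in for a Redis server.
// In a real deployment each process would hold a network connection to Redis.
const bus = new EventEmitter();
const CHANNEL = 'chat'; // hypothetical channel name

// Hypothetical model of one Node.js process (a cluster worker or a separate server).
function createInstance(name) {
  const localClients = new Map(); // userId -> messages delivered to that user

  // Every instance subscribes to the shared channel and hands whatever
  // arrives to the clients connected to this instance.
  bus.on(CHANNEL, (message) => {
    for (const inbox of localClients.values()) inbox.push(message);
  });

  return {
    connect(userId) {
      localClients.set(userId, []);
    },
    inbox(userId) {
      return localClients.get(userId);
    },
    // Broadcasts go through the shared channel, never straight to the
    // local clients, so every instance (including this one) sees them.
    broadcast(text) {
      bus.emit(CHANNEL, { from: name, text });
    },
  };
}

const workerX = createInstance('X');
const workerY = createInstance('Y');
workerX.connect('userA'); // user A is connected to worker X
workerY.connect('userB'); // user B is connected to worker Y

workerX.broadcast('hello from A');
console.log(workerY.inbox('userB')); // user B receives it although it is on another worker
```

Without the shared channel, worker X would only know about user A, which is exactly the problem described in the question.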
When using Apache Pulsar in k8s, is it possible to create multiple Pulsar clients in one application and connect to Pulsar at the same time? I'm aware that the convention is to use a single Pulsar client, but I'm curious whether multiple clients are possible. Would this cause any errors?
This depends on the client library: most of them allow it, but some don't. Having one client is not just a convention but a performance rule. Many things are cached at the client layer, so you should generally avoid creating multiple clients.
I use Redis because it allows me to scale my applications horizontally (multiple servers). By using its pub/sub features, all my servers can communicate with each other without needing to share memory.
So far, cool! We can add more Node.js servers, BUT all these servers subscribe to one single Redis server. So we have many Node.js servers communicating with just one Redis server: we can serve more clients, but we still have only one Redis.
From my tests, the Redis server uses fewer resources and so can handle more, but I still think it is a single point of failure (SPOF) in this design. What do you think?
What are the best ways to design a scalable system? I know about master/slave Redis, but I am still not convinced that it is the best solution.
Yes, Redis is a single point of failure in what you describe. Not only in the sense that when it is down your app is down, but also in the sense that if one of your processes removes or corrupts the data, it is lost forever.
What you can do is use multiple Redis servers and have a good backup strategy.
See this tutorial for clustering:
https://redis.io/topics/cluster-tutorial
See these tutorials for backups:
http://zdk.blinkenshell.org/redis-backup-and-restore/
https://redis.io/topics/persistence
https://www.digitalocean.com/community/tutorials/how-to-back-up-and-restore-your-redis-data-on-ubuntu-14-04
I want to use socket.io, but I'll be running multiple instances of my app, so that's where things get interesting.
I need to run multiple instances on different ports. No problem here.
I decided not to use Node's own cluster, I'll use Nginx for load balancing (that's why I create multiple instances of the app). Nginx supports websockets, so this one is also sorted out.
Given that there'll be multiple instances, they can't talk to each other directly (if user A connects to instance X and user B is connected to instance Y, they can't communicate, since the servers are independent of each other). So I need to use Redis' pub/sub mechanism as a wrapper in order to mimic socket.io's emit & broadcast functionality. This way, even if I have multiple instances of an app or run it on different servers, everybody will be able to talk to each other as long as they're connected to the same Redis server. To achieve this, I'll need to use the socket.io-redis and socket.io-emitter modules.
Have I got that right, or is there something wrong with this approach?
That sounds about right, except: are you sure you need multiple instances of your Node app on different ports? You might be able to handle more load with one server than you would assume, and there are ways to separate out different communication channels with socket.io.
Before starting to write my application I need to know what to do when a single node.js instance (express and (socket.io or nowjs)) isn't enough anymore.
You might tell me that I shouldn't worry about scale until the time comes, but I don't want to develop an application and then run into trouble because socket.io or NowJS can't easily be scaled across multiple instances.
I recently read that socket.io now supports a way to scale using Redis (which I also have no experience with). NowJS is built on top of socket.io. Does it work the same way? On nowjs.org you can read that a "distributed version of NowJS" is under development and is going to cost money.
If you need to scale Node, the first place people usually start is putting a load balancer in front of multiple Node instances. The standard for this today is nginx, though I would like to check out the Node balancer 'bouncy' that came out recently. Here's an example of someone using nginx as a reverse proxy to manage multiple Node instances:
Node.js + Nginx - What now?
The second thing you mention is socket.io/nowjs. Depending on how you're using these frameworks, you could get into a situation where you want to share context between clients who are hitting multiple node.js instances. If this is the case, I would recommend using a persistent store, like redis, to bridge the gap between your node instances. Here's an example:
How to reuse redis connection in socket.io?
Hopefully this is enough information and reading to get you started. Let me know if you have any questions.
Happy coding!
Another useful link on 'Scaling Socket.IO': https://github.com/dshaw/talks/tree/master/2011-10-jsclub (slides and sample application)
Just as a side note on using nginx as a reverse proxy with socket.io: the way I understand it at least, nginx 1.0.x, which is the stable version, does not support proxying HTTP/1.1 connections (which is needed to make socket.io work with websockets). There is a workaround described here: http://www.letseehere.com/reverse-proxy-web-sockets to make it work, or you can use something like https://github.com/nodejitsu/node-http-proxy instead; the guys at nodejitsu say it should support this.
I want to scale my Node.js Socket application vertically and horizontally and I haven't found a sophisticated solution yet.
My application has two use-cases:
Broadcast messages from one user to all others
Push messages from one user to a subset of users
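As a rough sketch of these two use cases across more than one server: each server keeps track of its own connected users and room memberships, and every message travels over a shared channel so that all servers can deliver it locally. This is only a model. The EventEmitter stands in for the shared Redis channel, and createServer, the room name and the user names are hypothetical. With socket.io and a Redis adapter, the two calls would roughly correspond to io.emit(...) and io.to(room).emit(...).

```javascript
const { EventEmitter } = require('events');

// Assumption: in-process stand-in for the channel all servers share (e.g. Redis).
const sharedChannel = new EventEmitter();

// Hypothetical model of one server with its locally connected users.
function createServer() {
  const users = new Set();  // users connected to this server
  const rooms = new Map();  // room name -> Set of local members
  const delivered = [];     // what this server pushed to its own sockets

  sharedChannel.on('message', ({ room, text }) => {
    // room === null means "everyone"; otherwise only local members of the room.
    const targets = room === null ? users : (rooms.get(room) || new Set());
    for (const userId of targets) delivered.push(`${userId}: ${text}`);
  });

  return {
    delivered,
    connect(userId) {
      users.add(userId);
    },
    join(userId, room) {
      if (!rooms.has(room)) rooms.set(room, new Set());
      rooms.get(room).add(userId);
    },
    toAll(text) {
      sharedChannel.emit('message', { room: null, text });
    },
    toRoom(room, text) {
      sharedChannel.emit('message', { room, text });
    },
  };
}

const s1 = createServer();
const s2 = createServer();
s1.connect('alice');
s2.connect('bob');
s2.connect('carol');
s1.join('alice', 'team');
s2.join('carol', 'team');

s1.toAll('maintenance at noon');    // use case 1: broadcast to all users
s1.toRoom('team', 'standup in 5');  // use case 2: push to a subset (alice and carol)

console.log(s1.delivered);
console.log(s2.delivered);
```

Here bob only gets the broadcast, while alice and carol get both messages even though they sit on different servers.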
On the one hand, I've read that I need Redis for both cases, together with socket.io-redis.
On the other hand, I've watched this video and read this SO answer, which say that Redis isn't reliable and that published messages are not guaranteed to arrive, so you should only use it for clustering/vertical scaling.
Microsoft Azure's solution, Service Bus, is out of the question because I don't want to use Azure.
Instead of Redis, the guy recommends using RabbitMQ for horizontal scaling.
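The reliability concern can be illustrated with a small model. Redis pub/sub is fire-and-forget: a message goes to whoever is subscribed at that moment, and a subscriber that is temporarily disconnected simply misses it. A message queue (which is the role RabbitMQ would play) holds messages until a consumer takes them. The two classes below are simplified in-memory stand-ins written for this illustration, not the API of either product.

```javascript
// Fire-and-forget pub/sub: deliver to current subscribers, then forget.
class PubSub {
  constructor() {
    this.subscribers = new Set();
  }
  subscribe(handler) {
    this.subscribers.add(handler);
    return () => this.subscribers.delete(handler); // call to unsubscribe
  }
  publish(message) {
    for (const handler of this.subscribers) handler(message);
  }
}

// Queue: keep messages until a consumer is there to take them.
class Queue {
  constructor() {
    this.pending = [];
    this.consumer = null;
  }
  publish(message) {
    this.pending.push(message);
    this.flush();
  }
  consume(handler) {
    this.consumer = handler;
    this.flush();
  }
  stop() {
    this.consumer = null;
  }
  flush() {
    while (this.consumer && this.pending.length > 0) {
      this.consumer(this.pending.shift());
    }
  }
}

const viaPubSub = [];
const viaQueue = [];
const pubsub = new PubSub();
const queue = new Queue();

const unsubscribe = pubsub.subscribe((m) => viaPubSub.push(m));
queue.consume((m) => viaQueue.push(m));
pubsub.publish('m1');
queue.publish('m1');

// The receiving server drops its connection for a moment.
unsubscribe();
queue.stop();
pubsub.publish('m2');
queue.publish('m2');

// The server reconnects.
pubsub.subscribe((m) => viaPubSub.push(m));
queue.consume((m) => viaQueue.push(m));
pubsub.publish('m3');
queue.publish('m3');

console.log(viaPubSub); // 'm2' is missing
console.log(viaQueue);  // all three messages arrived
```

Whether the lost message matters depends on the application: for presence or typing indicators it may be fine, for chat messages it may not be.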
For vertical scaling there is also socket.io-clusterhub, an IPC layer for Node processes, but it seems to work only with Socket.io <= v0.9.0.
Then there is this guy, who has implemented his own method to pass messages to other nodes via HTTP requests, which makes some sense. But why use HTTP requests if you could also establish direct socket connections between servers, push the message to all servers simultaneously, and avoid the delay of going from one server to another?
In conclusion, I thought maybe I could go with Redis on EACH server, just for exchanging messages between the clustered processes of my application, together with RabbitMQ as the S2S communication solution.
But it seems a bit like overkill to have one Redis per server plus another central RabbitMQ.
Is there any known shorter/better solution to scale Socket.io reliably in both directions?
EDIT:
I've tried using a single Redis server for multiple Node.js servers, where each of them uses clustering via sticky-session across all cores. While the clustering on its own works like a charm with Redis, there seems to be a problem when using multiple servers: messages won't arrive at the other nodes.
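For context on what sticky sessions do here: socket.io's handshake can span several HTTP requests, and they all have to reach the same process. One common way to get that is to hash the client address to a worker index. The sketch below shows the idea only; pickWorker is a hypothetical helper, not the sticky-session module's actual code.

```javascript
const crypto = require('crypto');

// Map a client address to a worker index. The same address always yields
// the same index, so all requests from one client land on one worker.
function pickWorker(clientAddress, workerCount) {
  const digest = crypto.createHash('sha256').update(clientAddress).digest();
  return digest.readUInt32BE(0) % workerCount;
}

const workerCount = 4; // e.g. one worker per core (assumption)
const first = pickWorker('203.0.113.7', workerCount);
const second = pickWorker('203.0.113.7', workerCount);
console.log(first === second); // true
```

Note that stickiness only decides which process handles a given client. Delivering a message to clients on another server still depends on every server's socket.io instance being attached to the same Redis, so that is worth checking first when messages stop at the server boundary.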
I'd say Kafka is a good fit for the horizontal scaling. It is a fairly sophisticated way of distributing a huge number of events across servers (which in the end is what you want). This is a good read about it: https://engineering.linkedin.com/kafka/running-kafka-scale
Regarding vertical scaling, instead of socket.io-clusterhub I would use PM2 (https://github.com/Unitech/pm2), which lets you dynamically scale the number of app processes on each machine, manage the logs, and report to keymetrics.io (if you are using it).
If you need any snippets, ask me and I will edit the answer, but there are quite a few in the PM2 GitHub repository.
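As a starting point, a PM2 process file for running one process per core might look roughly like this. The app name and the entry point are placeholders, not taken from the question.

```javascript
// ecosystem.config.js (PM2 process file)
const config = {
  apps: [
    {
      name: 'socket-app',     // hypothetical app name
      script: './server.js',  // hypothetical entry point of the application
      instances: 'max',       // start one process per available CPU core
      exec_mode: 'cluster',   // PM2 cluster mode: processes share the listening port
    },
  ],
};

module.exports = config;
```

You would start it with: pm2 start ecosystem.config.js, and later change the number of processes with something like: pm2 scale socket-app 4.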