Load Balance: Node.js - Socket.io - Redis

I have 3 servers running Node.js, connected to each other through Redis (1 master, 2 slaves).
Running the system on a single server works fine, but when I scale it to 3 Node.js servers, it starts missing messages and becomes unstable.
My load balancer does not support sticky sessions, so each request from a client can be routed to a different server.
All the Node.js servers point to the Redis master.
It looks like socket.io is storing information on each server that is not being distributed through Redis.
I'm using socket.io 0.9, and I suspect the problem may be that I don't have any handshake code. Could this be the reason?
My code to configure socket.io is:
var express = require('express');
var io = require('socket.io');
var redis = require('socket.io/node_modules/redis');
var RedisStore = require('socket.io/lib/stores/redis');
// Placeholders: replace with the real port and host of the Redis master.
var pub = redis.createClient("a port", "an ip");
var sub = redis.createClient("a port", "an ip");
var client = redis.createClient("a port", "an ip");
var events = require('./modules/eventHandler');
exports.createServer = function createServer() {
    var app = express();
    var server = app.listen(80);
    var socketIO = io.listen(server);
    socketIO.configure(function () {
        // Share socket.io state between processes through Redis.
        socketIO.set('store', new RedisStore({
            redisPub: pub,
            redisSub: sub,
            redisClient: client
        }));
        socketIO.set('resource', '/chat/socket.io');
        socketIO.set('log level', 0);
        socketIO.set('transports', ['htmlfile', 'xhr-polling', 'jsonp-polling']);
    });
    // attach event handlers
    events.attachHandlers(socketIO);
    // return server instance
    return server;
};

Redis only syncs from the master to the slaves. It never syncs from the slaves to the master. So, if you're writing to all 3 of your machines, then the only messages that will wind up synced across all three servers will be the ones hitting the master. This is why it looks like you're missing messages.
More info here.
Read only slave
Since Redis 2.6 slaves support a read-only mode that
is enabled by default. This behavior is controlled by the
slave-read-only option in the redis.conf file, and can be enabled and
disabled at runtime using CONFIG SET.
Read only slaves will reject all
the write commands, so that it is not possible to write to a slave
because of a mistake. This does not mean that the feature is conceived
to expose a slave instance to the internet or more generally to a
network where untrusted clients exist, because administrative commands
like DEBUG or CONFIG are still enabled. However security of read-only
instances can be improved disabling commands in redis.conf using the
rename-command directive.
You may wonder why it is possible to revert
the default and have slave instances that can be target of write
operations. The reason is that while this writes will be discarded if
the slave and the master will resynchronize, or if the slave is
restarted, often there is ephemeral data that is unimportant that can
be stored into slaves. For instance clients may take information about
reachability of master in the slave instance to coordinate a fail over
strategy.

I came across this post:
It can be a good idea to have a "proxy" between the Node.js servers and the load balancer.
With this approach, XHR polling can be used behind load balancers without sticky sessions.
Load balancing with node.js using http-proxy
Using nodejs-http-proxy, I can define custom routing rules, e.g. by adding a parameter to the socket.io "connect URL".
Has anyone tried this solution before?
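To make the "parameter on the connect URL" idea concrete, here is a minimal sketch of the routing decision such a proxy could make. Everything named here is an assumption for illustration: the `clientId` query parameter, the `backends` addresses, and the `pickBackend` helper are hypothetical and not part of socket.io or http-proxy.

```javascript
const crypto = require('crypto');

// Hypothetical list of backend Node.js servers sitting behind the proxy.
const backends = ['10.0.0.1:3000', '10.0.0.2:3000', '10.0.0.3:3000'];

// Pick a backend from a routing key carried in the query string.
// `clientId` is an assumed parameter name that the client would append
// to its socket.io connect URL, e.g. /chat/socket.io/1/?clientId=abc123.
function pickBackend(requestUrl, targets) {
  // The base URL is only needed so that relative request paths can be parsed.
  const url = new URL(requestUrl, 'http://placeholder.invalid');
  const key = url.searchParams.get('clientId');
  if (key === null) {
    return targets[0]; // no routing key: fall back to the first backend
  }
  // Hash the key so the same client always maps to the same backend.
  const digest = crypto.createHash('sha256').update(key).digest();
  return targets[digest.readUInt32BE(0) % targets.length];
}

console.log(pickBackend('/chat/socket.io/1/?clientId=abc123', backends));
```

With http-proxy, the returned address would typically be passed as the `target` option of `proxy.web(req, res, { target: 'http://' + backend })`, so every polling request from the same client reaches the server that holds its handshake state.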

Related

Sticky Session on Heroku

We have a NodeJS application running with SocketIO and clustering on Heroku. To get SocketIO working we use the redis-adapter as discussed here: https://socket.io/docs/using-multiple-nodes/.
Then we've implemented sticky sessions like shown in the sticky session documentation here: https://github.com/elad/node-cluster-socket.io.
Turns out that when we deploy to Heroku, the connection.remoteAddress in:
// Create the outside facing server listening on our port.
var server = net.createServer({ pauseOnConnect: true }, function(connection) {
    // We received a connection and need to pass it to the appropriate
    // worker. Get the worker for this connection's source IP and pass
    // it the connection.
    var index = worker_index(connection.remoteAddress, num_processes);
    var worker = workers[index];
    worker.send('sticky-session:connection', connection);
}).listen(port);
is actually the IP address of some Heroku routing server and NOT the client IP. I've seen that the request header "x-forwarded-for" could be used to get the client IP, but when we pause the connection in this way, we don't even have the headers yet?
We searched all over for a solution, but apparently there are no good solutions.
Here are some of the better suggestions:
https://github.com/indutny/sticky-session/issues/6
https://github.com/indutny/sticky-session/pull/45
None of them seemed good performance-wise, so we ended up switching the SocketIO communication to WebSockets only. This eliminates the need for sticky sessions altogether.
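For reference, this is roughly what reading the client address looks like once the HTTP headers have been parsed, which is exactly the step the paused `net` connection above never reaches. The `clientIp` helper is a hypothetical name, and taking the left-most entry is a common convention rather than a guarantee, since a client can send its own `x-forwarded-for` value.

```javascript
// Return the original client address from Node-style request headers
// (lower-cased keys), falling back to the socket's remote address.
function clientIp(headers, remoteAddress) {
  const forwarded = headers['x-forwarded-for'];
  if (typeof forwarded !== 'string' || forwarded.trim() === '') {
    return remoteAddress; // no proxy in front of us, or header missing
  }
  // The header is a comma-separated chain of addresses; by convention the
  // left-most entry is the original client. It can be spoofed, so treat it
  // as a routing hint rather than as an authenticated identity.
  return forwarded.split(',')[0].trim();
}

console.log(clientIp({ 'x-forwarded-for': '203.0.113.7, 10.0.0.5' }, '10.0.0.5'));
```

Any sticky routing based on this value has to happen in a layer that reads the request headers, which is why the issues linked above end up parsing part of the HTTP request in the master process.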

Node.js + Socket.IO scaling with redis + cluster

Currently, I'm faced with the task where I must scale a Node.js app using Amazon EC2. From what I understand, the way to do this is to have each child server use all available processes using cluster, and have sticky connections to ensure that every user connecting to the server is "remembered" as to what worker their data is currently on from previous sessions.
After doing this, the next best move from what I know is to deploy as many servers as needed, and use nginx to load balance between all of them, again using sticky connections to know which "child" server that each users data is on.
So when a user connects to the server, is this what happens?
Client connection -> Find/Choose server -> Find/Choose process -> Socket.IO handshake/connection etc.
If not, please allow me to better understand this load balancing task. I also do not understand the importance of redis in this situation.
Below is the code I'm using to run a separate Node.js process on each CPU of one machine:
var express = require('express'),
    cluster = require('cluster'),
    net = require('net'),
    sio = require('socket.io'),
    sio_redis = require('socket.io-redis');
var port = 3502,
num_processes = require('os').cpus().length;
if (cluster.isMaster) {
// This stores our workers. We need to keep them to be able to reference
// them based on source IP address. It's also useful for auto-restart,
// for example.
var workers = [];
// Helper function for spawning worker at index 'i'.
var spawn = function(i) {
workers[i] = cluster.fork();
// Optional: Restart worker on exit
workers[i].on('exit', function(code, signal) {
console.log('respawning worker', i);
spawn(i);
});
};
// Spawn workers.
for (var i = 0; i < num_processes; i++) {
spawn(i);
}
// Helper function for getting a worker index based on IP address.
// This is a hot path so it should be really fast. The way it works
// is by converting the IP address to a number by removing the dots,
// then compressing it to the number of slots we have.
//
// Compared against "real" hashing (from the sticky-session code) and
// "real" IP number conversion, this function is on par in terms of
// worker index distribution only much faster.
var worker_index = function(ip, len) {
var s = '';
for (var i = 0, _len = ip.length; i < _len; i++) {
if (ip[i] !== '.') {
s += ip[i];
}
}
return Number(s) % len;
};
// Create the outside facing server listening on our port.
var server = net.createServer({ pauseOnConnect: true }, function(connection) {
// We received a connection and need to pass it to the appropriate
// worker. Get the worker for this connection's source IP and pass
// it the connection.
var worker = workers[worker_index(connection.remoteAddress, num_processes)];
worker.send('sticky-session:connection', connection);
}).listen(port);
} else {
// Note we don't use a port here because the master listens on it for us.
var app = new express();
// Here you might use middleware, attach routes, etc.
// Don't expose our internal server to the outside.
var server = app.listen(0, 'localhost'),
io = sio(server);
// Tell Socket.IO to use the redis adapter. By default, the redis
// server is assumed to be on localhost:6379. You don't have to
// specify them explicitly unless you want to change them.
io.adapter(sio_redis({ host: 'localhost', port: 6379 }));
// Here you might use Socket.IO middleware for authorization etc.
console.log("Listening");
// Listen to messages sent from the master. Ignore everything else.
process.on('message', function(message, connection) {
if (message !== 'sticky-session:connection') {
return;
}
// Emulate a connection event on the server by emitting the
// event with the connection the master sent us.
server.emit('connection', connection);
connection.resume();
});
}
I believe your general understanding is correct, although I'd like to make a few comments:
Load balancing
You're correct that one way to do load balancing is having nginx load balance between the different instances, and inside each instance have cluster balance between the worker processes it creates. However, that's just one way, and not necessarily always the best one.
Between instances
For one, if you're using AWS anyway, you might want to consider using ELB. It was designed specifically for load balancing EC2 instances, and it makes the problem of configuring load balancing between instances trivial. It also provides a lot of useful features, and (with Auto Scaling) can make scaling extremely dynamic without requiring any effort on your part.
One feature ELB has, which is particularly pertinent to your question, is that it supports sticky sessions out of the box - just a matter of marking a checkbox.
However, I have to add a major caveat, which is that ELB can break socket.io in bizarre ways. If you just use long polling you should be fine (assuming sticky sessions are enabled), but getting actual websockets working is somewhere between extremely frustrating and impossible.
Between processes
While there are a lot of alternatives to using cluster, both within Node and without, I tend to agree cluster itself is usually perfectly fine.
However, one case where it does not work is when you want sticky sessions behind a load balancer, as you apparently do here.
First off, it should be made explicit that the only reason you even need sticky sessions in the first place is because socket.io relies on session data stored in-memory between requests to work (during the handshake for websockets, or basically throughout for long polling). In general, relying on data stored this way should be avoided as much as possible, for a variety of reasons, but with socket.io you don't really have a choice.
Now, this doesn't seem too bad, since cluster can support sticky sessions, using the sticky-session module mentioned in socket.io's documentation, or the snippet you seem to be using.
The thing is, since these sticky sessions are based on the client's IP, they won't work behind a load balancer, be it nginx, ELB, or anything else, since all that's visible inside the instance at that point is the load balancer's IP. The remoteAddress your code tries to hash isn't actually the client's address at all.
That is, when your Node code tries to act as a load balancer between processes, the IP it tries to use will just always be the IP of the other load balancer, that balances between instances. Therefore, all requests will end up at the same process, defeating cluster's whole purpose.
You can see the details of this issue, and a couple of potential ways to solve it (none of which is particularly pretty), in this question.
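A small sketch of the effect described above, reusing the hashing helper from the question (renamed `workerIndex` here). The IP addresses are made-up documentation addresses, and `10.0.0.5` stands in for the load balancer's internal address.

```javascript
// Same hashing helper as in the question: strip the dots, then take the
// remainder modulo the number of workers.
function workerIndex(ip, len) {
  let s = '';
  for (let i = 0; i < ip.length; i++) {
    if (ip[i] !== '.') {
      s += ip[i];
    }
  }
  return Number(s) % len;
}

const numWorkers = 4;

// Direct connections: different client IPs can land on different workers.
const direct = ['203.0.113.7', '198.51.100.22', '192.0.2.45']
  .map(ip => workerIndex(ip, numWorkers));

// Behind a load balancer, every connection reports the balancer's address,
// so every connection hashes to the same worker.
const viaBalancer = ['10.0.0.5', '10.0.0.5', '10.0.0.5']
  .map(ip => workerIndex(ip, numWorkers));

console.log('direct:', direct);
console.log('via balancer:', viaBalancer);
```

The second list contains a single repeated index, which is the "all requests end up at the same process" situation.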
The importance of Redis
As I mentioned earlier, once you have multiple instances/processes receiving requests from your users, in-memory storage of session data is no longer sufficient. Sticky sessions are one way to go, although other, arguably better solutions exist, among them central session storage, which Redis can provide. See this post for a pretty comprehensive review of the subject.
Seeing as your question is about socket.io, though, I'll assume you probably meant Redis's specific importance for websockets, so:
When you have multiple socket.io servers (instances/processes), a given user will be connected to only one such server at any given time. However, any of the servers may, at any time, wish to emit a message to a given user, or even a broadcast to all users, regardless of which server they're currently under.
To that end, socket.io supports "Adapters", of which Redis is one, that allow the different socket.io servers to communicate among themselves. When one server emits a message, it goes into Redis, and then all servers see it (Pub/Sub) and can send it to their users, making sure the message will reach its target.
This, again, is explained in socket.io's documentation regarding multiple nodes, and perhaps even better in this Stack Overflow answer.
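To illustrate the idea (not socket.io's actual implementation), here is a toy model in which an in-process `EventEmitter` stands in for the Redis Pub/Sub channel. `createServer`, `connect`, `emitTo`, and `inbox` are hypothetical names invented for this sketch.

```javascript
const { EventEmitter } = require('events');

// Stand-in for the Redis Pub/Sub channel shared by all servers.
const bus = new EventEmitter();

// Toy model of one socket.io server: it knows only its own local users.
function createServer(name) {
  const localUsers = new Map(); // userId -> messages delivered to that user

  // Every server subscribes to the shared channel and delivers only the
  // messages addressed to users that are connected to it.
  bus.on('message', ({ to, text }) => {
    if (localUsers.has(to)) {
      localUsers.get(to).push(text);
    }
  });

  return {
    name,
    connect(userId) { localUsers.set(userId, []); },
    // Emitting publishes to the shared channel instead of checking only
    // the local users, so the message reaches whichever server owns the user.
    emitTo(userId, text) { bus.emit('message', { to: userId, text }); },
    inbox(userId) { return localUsers.get(userId) || []; },
  };
}

const serverA = createServer('A');
const serverB = createServer('B');

serverB.connect('alice');          // alice is connected to server B
serverA.emitTo('alice', 'hello');  // server A wants to reach her

console.log(serverB.inbox('alice'));
```

Server A never needs to know where alice is connected; it publishes, and the server that holds her socket delivers the message.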

How to check socket is alive (connected) in socket.io with multiple nodes and socket.io-redis

I am using socket.io with multiple nodes, socket.io-redis and nginx. I follow this guide: http://socket.io/docs/using-multiple-nodes/
What I am trying to do: in a function on the server side, I want to check by socket id whether a given socket is connected or disconnected.
I tried io.of('namespace').connected[socketid], but it only works for sockets on the current process.
Can anyone help me? Thanks in advance.
How can I check whether a socket is alive (connected) by socket id? I tried
namespace.connected[socketid], but it only works for the current process.
As you said, separate processes mean that the sockets are only registered on the process that they first connected to. You need to use socket.io-redis to connect all your nodes together, and what you can do is broadcast an event each time a client connects/disconnects, so that each node has an updated real-time list of all the clients.
Check out here
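A rough sketch of that "broadcast connect/disconnect" bookkeeping, with an in-process `EventEmitter` standing in for the channel shared between nodes (Redis Pub/Sub in a real deployment). `createNode`, `onLocalConnect`, `onLocalDisconnect`, and `isConnected` are hypothetical names for illustration only.

```javascript
const { EventEmitter } = require('events');

// Stand-in for a channel shared by all nodes.
const presenceBus = new EventEmitter();

function createNode() {
  // Ids of sockets connected anywhere in the cluster, as seen by this node.
  const connected = new Set();
  presenceBus.on('connect', id => connected.add(id));
  presenceBus.on('disconnect', id => connected.delete(id));

  return {
    // Call these from the node's own connection / disconnect handlers.
    onLocalConnect(id) { presenceBus.emit('connect', id); },
    onLocalDisconnect(id) { presenceBus.emit('disconnect', id); },
    // Works on every node, not only the one holding the socket.
    isConnected(id) { return connected.has(id); },
  };
}

const nodeA = createNode();
const nodeB = createNode();

nodeA.onLocalConnect('socket-1');
console.log(nodeB.isConnected('socket-1'));
nodeA.onLocalDisconnect('socket-1');
console.log(nodeB.isConnected('socket-1'));
```

One caveat of this approach is that a node started later has missed earlier events, so in practice the list would also need to be seeded from some shared state.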
As mentioned above, you should use socket.io-redis to get it to work on multiple nodes.
var io = require('socket.io')(3000);
var redis = require('socket.io-redis');
io.adapter(redis({ host: 'localhost', port: 6379 }));
I had the same problem and found no solution that suited me, so I logged the client object to see which methods and properties I could use. There is the client.conn.readyState property for the state of the connection ("open"/"closed") and the client.onclose() function to capture the closing of the connection.
const server = require('http').createServer(app);
const io = require('socket.io')(server);
let clients = [];
io.on('connection', (client)=>{
    clients.push(client);
    console.log(client.conn.readyState);
    client.onclose = ()=>{
        // do something
        console.log(client.conn.readyState);
        clients.splice(clients.indexOf(client),1);
    }
});
When deploying a Socket.IO application on a multi-node cluster, that is, with multiple Socket.IO servers, there are two things to take care of:
Using the Redis adapter, and enabling sticky sessions: when a request comes from a Socket.IO client (browser) to your app, it gets associated with a particular session id, and subsequent requests must keep going to the same process (Pod in Kubernetes) that issued that id.
You can learn more about this from this Medium story (source code available): https://saphidev.medium.com/socketio-redis...

Socket.io and multiple dynos on Heroku Node.js app. WebSocket is closed before the connection is established

I'm building an App deployed to Heroku which uses Websockets.
The websockets connection is working properly when I use only 1 dyno, but when I scale to >1, I get the following errors
POST
http://****.herokuapp.com/socket.io/?EIO=2&transport=polling&t=1412600135378-1&sid=zQzJJ8oPo5p3yiwIAAAC
400 (Bad Request) socket.io-1.0.4.js:2
WebSocket connection to
'ws://****.herokuapp.com/socket.io/?EIO=2&transport=websocket&sid=zQzJJ8oPo5p3yiwIAAAC'
failed: WebSocket is closed before the connection is established.
socket.io-1.0.4.js:2
I am using the Redis adaptor to enable multiple web processes
var io = socket.listen(server);
var redisAdapter = require('socket.io-redis');
var redis = require('redis');
var pub = redis.createClient(18049, '[URI]', {auth_pass:"[PASS]"});
var sub = redis.createClient(18049, '[URI]', {detect_buffers: true, auth_pass:"[PASS]"} );
io.adapter( redisAdapter({pubClient: pub, subClient: sub}) );
This is working on localhost (which I am using foreman to run, as Heroku does, and I am launching 2 web processes, same as on Heroku).
Before I implemented the redis adaptor I got a web-sockets handshake error, so the adaptor has had some effect. Also it is working occasionally now, I assume when the sockets match the same web dyno.
I have also tried to enable sticky sessions, but then it never works.
var sticky = require('sticky-session');
sticky(1, server).listen(port, function (err) {
    if (err) {
        console.error(err);
        return process.exit(1);
    }
    console.log('Worker listening on %s', port);
});
I'm the Node.js Platform Owner at Heroku.
WebSockets work on Heroku out-of-the-box across multiple dynos; socket.io (and other realtime libs) use fallbacks to stateless processes like xhr polling that break without session affinity.
To scale up socket.io apps, first follow all the instructions from socket.io:
http://socket.io/docs/using-multiple-nodes/
Then, enable session affinity on your app (this is a free feature):
https://devcenter.heroku.com/articles/session-affinity
I spent a while trying to make socket.io work in multi-server architecture, first on Heroku and then on Openshift as many suggest.
The only way to make it work on both PaaS providers is disabling xhr-polling and setting transports: ['websocket'] on both client and server.
On Openshift, you must explicitly set the port of the server to 8000 (for ws; 8443 for wss) in the socket.io client initialization, using the *.rhcloud.com server, as explained in this post: http://tamas.io/deploying-a-node-jssocket-io-app-to-openshift/.
Polling strategy doesn't work on Heroku because it does not support sticky sessions (https://github.com/Automattic/engine.io/issues/261), and on Openshift it fails because of this issue: https://github.com/Automattic/engine.io/issues/279, that will hopefully be fixed soon.
So, the only solution I have found so far is to disable polling and use the websocket transport only.
To do that with socket.io > 1.0:
server-side:
var app = express();
var server = require('http').createServer(app);
var socketio = require('socket.io')(server, {
path: '/socket.io-client'
});
socketio.set('transports', ['websocket']);
client-side:
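// Use whichever host applies to you: '<your-openshift-app>.rhcloud.com:8000'
// on Openshift or '<your-heroku-app>.herokuapp.com' on Heroku.
var ioSocket = io('<your-openshift-app>.rhcloud.com:8000', {
    path: '/socket.io-client',
    transports: ['websocket']
});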
Hope this will help.
It could be you need to be running RedisStore:
var session = require('express-session');
var RedisStore = require('connect-redis')(session);
app.use(session({
store: new RedisStore(options),
secret: 'keyboard cat'
}));
per earlier q here: Multiple dynos on Heroku + socket.io broadcasts
I know this isn't a normal answer, but I've tried to get WebSockets working on Heroku for more than a week. After many long conversations with customer support I finally tried out OpenShift. Heroku WebSockets are in beta, but OpenShift WebSockets are stable. I got my code working on OpenShift in under an hour.
http://www.openshift.com
I am not affiliated with OpenShift in any way. I'm just a satisfied (non-paying) customer.
I was having huge problems with this. A number of things were failing simultaneously, which made it a nightmare. Make sure you do the following to scale socket.io on Heroku:
If you're using clusters, make sure you implement socketio-sticky-session or something similar.
The client's connect URL should not be https://example.com/socket.io/?EIO=3&transport=polling but rather https://example.com/ (I'm using https because Heroku supports it).
Enable CORS in socket.io.
Specify only websocket connections.
For you and others, it could be any one of these.
If you're having trouble setting up sticky-session clusters, here's my working code:
var http = require('http');
var cluster = require('cluster');
var numCPUs = require('os').cpus().length;
var sticky = require('socketio-sticky-session');
var redis = require('socket.io-redis');
var io;
if(cluster.isMaster){
console.log('Inside Master');
// create the worker processes
for (var i = 0; i < numCPUs ; i++){
cluster.fork();
}
}
else {
// The worker code to be run is written inside
// the sticky().
}
sticky(function(){
// This code runs inside the workers.
// The sticky-session balances connection between workers based on their ip.
// So all the requests from the same client would go to the same worker.
// If multiple browser windows are opened in the same client, all would be
// redirected to the same worker.
io = require('socket.io')({transports: ['websocket'], 'origins' : '*:*'});
var server = http.createServer(function(req,res){
res.end('socket.io');
})
io.listen(server);
// The Redis server can also be used to store the socket state
//io.adapter(redis({host:'localhost', port:6379}));
console.log('Worker: '+cluster.worker.id);
// when multiple workers are spawned, the client
// cannot connect to the cloudlet.
StartConnect(); //this function connects my mongodb, then calls a function with io.on('connection', ..... socket.on('message'...... in relation to the io variable above
return server;
}).listen(process.env.PORT || 4567, function(){
console.log('Socket.io server is up ');
});
More information:
Personally, it would work flawlessly from a session not using websockets (I'm using socket.io for a Unity game, and it worked flawlessly from the editor only!). When connecting through the browser, whether Chrome or Firefox, it would show these handshake errors, along with 503 and 400 errors.

SocketIO on a Node.js cluster

I have a standalone Node.js app which has SocketIO server that listens on a certain port, e.g. 8888. Now I am trying to run this app in a cluster and because cluster randomly assigns workers to requests, SocketIO clients in XHR polling mode once handshaken and authorized with one worker get routed to another worker where they're not handshaken and the mess begins.
And because workers don't share anything, I can't find a workaround. Is there a known solution to this issue?
There is no "simple" solution. What you have to do is the following:
If a client connects to a worker, save the connection-id together with the worker-id and a potential additional identification-id in a global store (i.e. one accessible to all workers, such as Redis).
If a client gets routed to another worker, use the store to look up which worker is responsible for this client (either with the connection-id or with the additional identification-id) and then hand it over to that worker (either via Node.js worker-master-worker communication or via Redis Pub/Sub).
I have implemented such a thing with sock.js, with an additional degree of complexity: I have two node.js servers with four workers each, so I had to use Redis Pub/Sub for worker/worker communication, because it is not guaranteed that they are on the same machine.
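The two steps above could be sketched as follows. This is only an outline of the bookkeeping under stated assumptions: a plain `Map` stands in for the shared store (Redis in the answer), and `registerConnection` and `routeRequest` are hypothetical helper names. The actual hand-over (worker messaging or Pub/Sub) is left out.

```javascript
// Stand-in for a store that every worker can reach (Redis in the answer).
const sharedStore = new Map(); // connectionId -> workerId

// Step 1: the worker that first accepts a client records ownership.
function registerConnection(connectionId, workerId) {
  sharedStore.set(connectionId, workerId);
}

// Step 2: a worker that receives a request looks up the responsible worker
// and decides whether to handle the request itself or hand it over.
function routeRequest(connectionId, receivingWorkerId) {
  const owner = sharedStore.get(connectionId);
  if (owner === undefined) {
    // Unknown connection: the receiving worker becomes responsible for it.
    registerConnection(connectionId, receivingWorkerId);
    return { handledBy: receivingWorkerId, forwarded: false };
  }
  return { handledBy: owner, forwarded: owner !== receivingWorkerId };
}

console.log(routeRequest('conn-1', 'worker-2')); // first contact
console.log(routeRequest('conn-1', 'worker-3')); // routed elsewhere later
```

The `forwarded` flag marks the cases where the receiving worker would have to pass the request on to the owner.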
Actually there is a simple solution: using Redis to store sockets states.
Everything is explained in Socket.IO documentation:
The default 'session' storage in Socket.IO is in memory (MemoryStore).
The MemoryStore only allows you to deploy socket.io on a single
process. If you want to scale to multiple process and / or multiple
servers you can use our RedisStore which uses the Redis NoSQL database
as man in the middle.
So in order to change the store instance to RedisStore we add this:
var RedisStore = require('socket.io/lib/stores/redis')
, redis = require('socket.io/node_modules/redis')
, pub = redis.createClient()
, sub = redis.createClient()
, client = redis.createClient();
// Needs to be done after 'listen()'
io.set('store', new RedisStore({
redisPub : pub
, redisSub : sub
, redisClient : client
}));
Of course you will need to have a redis server running.

Resources