Scalable architecture for socket.io

I am new to socket.io and Node.js, and I am trying to build a scalable application with a high number of simultaneous socket connections (10,000+).
Currently, I have started on a model where my server creates child processes, and every child process listens on a specific port with a socket.io instance attached. Once a client connects, it is redirected to a specific port.
The big question is: does having several socket.io instances on several ports increase the number of possible connections?
Here is my code, just in case:
Server:
var http = require('http');
var app = require('./app');
var server = http.createServer(app);
server.childList = [];
// Fork one child process per port
for (var i = 0; i < app.portList.length; i++) {
  server.childList[i] = require('child_process').fork('child.js');
}
server.listen(3443, () => {
  // Once the main server is up, tell each child which port to listen on
  for (var i = 0; i < app.portList.length; i++) {
    server.childList[i].send({ message: 'createServer', port: app.portList[i] });
  }
});
child.js:
var app = require('./app');
var http = require('http');
var socket_io = require( "socket.io" );
process.on('message', (m) => {
  if (m.message === 'createServer') {
    var childServ = http.createServer(app);
    childServ.listen(m.port, () => {
      console.log("childServ listening on port "+m.port);
    });
    var io = socket_io();
    io.attach( childServ );
    io.sockets.on('connection', function (socket) {
      console.log("A client just connected to my socket_io server on port "+m.port);
    });
  }
});
Feel free to release the kraken if I did something horrible there

First off, what you need to optimize depends on how busy your socket.io connections are and whether the activity is mostly asynchronous I/O or CPU-intensive work. As you may already know, node.js already scales really well for asynchronous I/O, but it needs multiple processes to scale well for CPU-intensive work. Further, there are some situations where the garbage collector gets too busy (lots and lots of small requests being served), and you need to go to multiple processes for that reason too.
More server instances (up to at least the number of CPUs you have in the server) will give you more CPU processing power, if that's what you need. They won't necessarily increase the maximum number of connections you can support on a box if most of those connections are idle. For that, you have to tune your server specifically to support lots and lots of connections.
Usually, you would NOT want N socket.io servers each listening on a different port. That puts the burden on the clients to somehow select a port, and the client has to know exactly which ports to choose from (e.g. how many server instances you have).
Instead, you usually have N processes all listening on the same port, with some sort of load balancer distributing the load among them. This makes the server infrastructure transparent to the clients, which means you can scale the servers up or down without changing the client behavior at all. In fact, you can even add more than one physical server box and increase capacity even further that way.
Here's an article from the socket.io docs on using multiple nodes with a load balancer to increase capacity: Socket.io - using multiple nodes (updated link). There's also explicit support for combining multiple socket.io instances with Redis, so you can communicate with any socket.io client regardless of which process it is connected to.
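To illustrate the "N processes on one port" idea, here is a minimal sketch using Node's built-in cluster module with a plain HTTP server. The names `workerCount` and `startCluster` and the port in the usage comment are assumptions for illustration only. Socket.io on top of this would still need sticky sessions and an adapter, as covered in the linked docs.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Hypothetical helper: cap the number of workers at the number of CPU cores.
function workerCount(requested, cpuCount = os.cpus().length) {
  return Math.max(1, Math.min(requested, cpuCount));
}

// Hypothetical entry point: the primary process forks workers, and every
// worker listens on the SAME port; the cluster module shares the socket.
function startCluster(port, requestedWorkers) {
  if (cluster.isMaster) { // called isPrimary in newer Node versions
    const n = workerCount(requestedWorkers);
    for (let i = 0; i < n; i++) cluster.fork();
    cluster.on('exit', (worker) => {
      console.log(`worker ${worker.process.pid} exited, forking a replacement`);
      cluster.fork();
    });
  } else {
    http.createServer((req, res) => {
      res.end(`handled by pid ${process.pid}\n`);
    }).listen(port);
  }
}

// Usage (not run here): startCluster(3443, os.cpus().length);
```

Clients only ever see one port, so the number of workers can change without touching the client side.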

Does having several socket.io instances on several ports increase the number of possible connections?
Yes, you have built a simple load balancer, which is a pretty common practice. There are several good tutorials about different ways of scaling node.js.
Horizontally scale socket.io with redis
http://goldfirestudios.com/blog/136/Horizontally-Scaling-Node.js-and-WebSockets-with-Redis
Your load balancer will speed up your code up to a point, because you utilize multiple processes, but I read on another thread a while ago that a rule of thumb is to start with around 2-3 processes per CPU core. More than that causes more overhead than it helps, but that is highly dependent on the situation.

Related

Will this take advantage of multiple CPU cores?

My Node server with Socket.IO runs behind Nginx. I am load balancing it with Nginx. The client is directed to one of these ports:
upstream nodes {
  ip_hash;
  server localhost:2000;
  server localhost:3000;
  server localhost:4000;
  server localhost:5000;
}
My Node server is set up like this:
function server(port){
  const http = require(`http`).createServer((req, res) => {
    // http stuff ...
  }).listen(port)
  const io = require(`socket.io`)(http)
  // socket stuff ...
}
server(2000)
server(3000)
server(4000)
server(5000)
Does Node run each of these ports on a different core, or what exactly am I load balancing here?
And should the socket code go inside or outside of the server function?

The code you've posted will run four servers in a single process, on a single CPU core.
You can, however, run four separate server processes, presumably on four cores, with a small change. Pass the port as a command line argument:
function server(port){
  const http = require(`http`).createServer((req, res) => {
    // http stuff ...
  }).listen(port)
  const io = require(`socket.io`)(http)
  // socket stuff ...
}
// The port comes from the command line, e.g. `node index.js 2000`
server(process.argv[2])
Then start them as:
node index.js 2000 &
node index.js 3000 &
node index.js 4000 &
node index.js 5000 &
Note: Node.js actually has clustering capabilities built in. Read the docs if you want to explore clustering further: https://nodejs.org/api/cluster.html

No, it will not generally take significant advantage of multiple cores. node.js is mostly single threaded. Starting four servers all within the same node.js process does not give you any more threads or processes which could use the extra cores. If your port 2000 server is running code in service of a request, none of your other servers can run code at the same time. They have to wait until the port 2000 server yields control back to the one node.js thread. Then another server can pick up a request and start executing, and all the others have to wait for it.
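To make the single-thread point concrete, here is a small sketch (the `busyWait` helper and the 50 ms figure are made up for illustration). A synchronous loop stands in for CPU-bound request handling on one server, and a zero-delay timer stands in for work queued for another server in the same process.

```javascript
// Work queued for "another server" in the same process.
let timerFired = false;
setTimeout(() => { timerFired = true; }, 0);

// Simulate CPU-bound request handling with a synchronous busy loop.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin */ }
}

busyWait(50);

// Still false: nothing else could run while the loop held the one thread.
const firedDuringBusyWait = timerFired;
console.log('timer fired during busy loop?', firedDuringBusyWait); // false
```

The timer callback only runs after the busy loop returns, no matter how many servers or ports the process owns.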
What you may be interested in is the cluster module, which will actually start multiple node.js processes (all doing the same thing) and then distribute requests among them. You do not need nginx to use this type of clustering, as the cluster module does it all within node.js. There's one server listener, and incoming requests are distributed to the various clustered processes.
You can read more about it in this other answer from earlier today, including a reference to a very good blog post that explains how node.js clustering works:
Node: one core, many processes
While you yourself don't want one core, many processes, many of the architectural issues are covered in that question.

Node.js + Socket.IO scaling with redis + cluster

Currently, I'm faced with the task of scaling a Node.js app using Amazon EC2. From what I understand, the way to do this is to have each server use all available CPU cores via cluster, and to use sticky connections so that every user connecting to the server is "remembered" in terms of which worker their data is on from previous sessions.
After doing this, the next best move from what I know is to deploy as many servers as needed and use nginx to load balance between all of them, again using sticky connections to know which "child" server each user's data is on.
So when a user connects to the server, is this what happens?
Client connection -> Find/Choose server -> Find/Choose process -> Socket.IO handshake/connection etc.
If not, please allow me to better understand this load balancing task. I also do not understand the importance of redis in this situation.
Below is the code I'm using to run a separate Node.js process on each CPU of one machine:
var express = require('express'),
    cluster = require('cluster'),
    net = require('net'),
    sio = require('socket.io'),
    sio_redis = require('socket.io-redis');
var port = 3502,
num_processes = require('os').cpus().length;
if (cluster.isMaster) {
// This stores our workers. We need to keep them to be able to reference
// them based on source IP address. It's also useful for auto-restart,
// for example.
var workers = [];
// Helper function for spawning worker at index 'i'.
var spawn = function(i) {
workers[i] = cluster.fork();
// Optional: Restart worker on exit
workers[i].on('exit', function(code, signal) {
console.log('respawning worker', i);
spawn(i);
});
};
// Spawn workers.
for (var i = 0; i < num_processes; i++) {
spawn(i);
}
// Helper function for getting a worker index based on IP address.
// This is a hot path so it should be really fast. The way it works
// is by converting the IP address to a number by removing the dots,
// then compressing it to the number of slots we have.
//
// Compared against "real" hashing (from the sticky-session code) and
// "real" IP number conversion, this function is on par in terms of
// worker index distribution only much faster.
var worker_index = function(ip, len) {
var s = '';
for (var i = 0, _len = ip.length; i < _len; i++) {
if (ip[i] !== '.') {
s += ip[i];
}
}
return Number(s) % len;
};
// Create the outside facing server listening on our port.
var server = net.createServer({ pauseOnConnect: true }, function(connection) {
// We received a connection and need to pass it to the appropriate
// worker. Get the worker for this connection's source IP and pass
// it the connection.
var worker = workers[worker_index(connection.remoteAddress, num_processes)];
worker.send('sticky-session:connection', connection);
}).listen(port);
} else {
// Note we don't use a port here because the master listens on it for us.
var app = new express();
// Here you might use middleware, attach routes, etc.
// Don't expose our internal server to the outside.
var server = app.listen(0, 'localhost'),
io = sio(server);
// Tell Socket.IO to use the redis adapter. By default, the redis
// server is assumed to be on localhost:6379. You don't have to
// specify them explicitly unless you want to change them.
io.adapter(sio_redis({ host: 'localhost', port: 6379 }));
// Here you might use Socket.IO middleware for authorization etc.
console.log("Listening");
// Listen to messages sent from the master. Ignore everything else.
process.on('message', function(message, connection) {
if (message !== 'sticky-session:connection') {
return;
}
// Emulate a connection event on the server by emitting the
// event with the connection the master sent us.
server.emit('connection', connection);
connection.resume();
});
}

I believe your general understanding is correct, although I'd like to make a few comments:
Load balancing
You're correct that one way to do load balancing is having nginx load balance between the different instances, and inside each instance have cluster balance between the worker processes it creates. However, that's just one way, and not necessarily always the best one.
Between instances
For one, if you're using AWS anyway, you might want to consider using ELB. It was designed specifically for load balancing EC2 instances, and it makes the problem of configuring load balancing between instances trivial. It also provides a lot of useful features, and (with Auto Scaling) can make scaling extremely dynamic without requiring any effort on your part.
One feature ELB has, which is particularly pertinent to your question, is that it supports sticky sessions out of the box - just a matter of marking a checkbox.
However, I have to add a major caveat, which is that ELB can break socket.io in bizarre ways. If you just use long polling you should be fine (assuming sticky sessions are enabled), but getting actual websockets working is somewhere between extremely frustrating and impossible.
Between processes
While there are a lot of alternatives to using cluster, both within Node and without, I tend to agree cluster itself is usually perfectly fine.
However, one case where it does not work is when you want sticky sessions behind a load balancer, as you apparently do here.
First off, it should be made explicit that the only reason you even need sticky sessions in the first place is because socket.io relies on session data stored in-memory between requests to work (during the handshake for websockets, or basically throughout for long polling). In general, relying on data stored this way should be avoided as much as possible, for a variety of reasons, but with socket.io you don't really have a choice.
Now, this doesn't seem too bad, since cluster can support sticky sessions, using the sticky-session module mentioned in socket.io's documentation, or the snippet you seem to be using.
The thing is, since these sticky sessions are based on the client's IP, they won't work behind a load balancer, be it nginx, ELB, or anything else, since all that's visible inside the instance at that point is the load balancer's IP. The remoteAddress your code tries to hash isn't actually the client's address at all.
That is, when your Node code tries to act as a load balancer between processes, the IP it tries to use will just always be the IP of the other load balancer, that balances between instances. Therefore, all requests will end up at the same process, defeating cluster's whole purpose.
You can see the details of this issue, and a couple of potential ways to solve it (none of which is particularly pretty), in this question.
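A small sketch of the problem described above, using the same hashing idea as the question's worker_index. The `clientIp` helper, the sample addresses, and the assumption that the outer load balancer sets an X-Forwarded-For header that can be trusted are all mine, not something from the question.

```javascript
// Same idea as the question's worker_index: strip the dots, read the
// digits as a number, and take it modulo the worker count.
function workerIndex(ip, workerCount) {
  return Number(ip.split('.').join('')) % workerCount;
}

// Hypothetical helper: prefer the first X-Forwarded-For entry when present,
// otherwise fall back to the socket's remote address.
// Assumption: only the load balancer can reach this server, so the header
// it sets can be trusted.
function clientIp(headers, remoteAddress) {
  const forwarded = headers['x-forwarded-for'];
  if (forwarded) return forwarded.split(',')[0].trim();
  return remoteAddress;
}

const balancerAddress = '10.0.0.1'; // what remoteAddress shows behind a balancer
const requests = [
  { 'x-forwarded-for': '203.0.113.7' },
  { 'x-forwarded-for': '203.0.113.8' },
];

// Hashing remoteAddress: every request lands on the same worker.
const byRemote = requests.map(() => workerIndex(balancerAddress, 4));
// Hashing the forwarded address: requests spread out again.
const byForwarded = requests.map((h) => workerIndex(clientIp(h, balancerAddress), 4));

console.log(byRemote, byForwarded); // [ 1, 1 ] [ 1, 2 ]
```

Note that the master in the question works at the TCP level and never sees HTTP headers, so using a header like this would mean parsing the request in the master or leaving stickiness entirely to the outer balancer.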
The importance of Redis
As I mentioned earlier, once you have multiple instances/processes receiving requests from your users, in-memory storage of session data is no longer sufficient. Sticky sessions are one way to go, although other, arguably better solutions exist, among them central session storage, which Redis can provide. See this post for a pretty comprehensive review of the subject.
Seeing as your question is about socket.io, though, I'll assume you probably meant Redis's specific importance for websockets, so:
When you have multiple socket.io servers (instances/processes), a given user will be connected to only one such server at any given time. However, any of the servers may, at any time, wish to emit a message to a given user, or even a broadcast to all users, regardless of which server they're currently under.
To that end, socket.io supports "Adapters", of which Redis is one, that allow the different socket.io servers to communicate among themselves. When one server emits a message, it goes into Redis, and then all servers see it (Pub/Sub) and can send it to their users, making sure the message will reach its target.
This, again, is explained in socket.io's documentation regarding multiple nodes, and perhaps even better in this Stack Overflow answer.
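As a rough mental model of the adapter idea, here is a sketch in which an in-memory EventEmitter stands in for Redis Pub/Sub. The `ServerNode` class and its methods are hypothetical and are not how socket.io-redis is actually implemented; the point is only that every server hears every message and delivers it to the users it holds locally.

```javascript
const { EventEmitter } = require('events');

// Stand-in for Redis Pub/Sub: every server subscribes to the same bus.
const bus = new EventEmitter();

// Hypothetical model of one socket.io server process.
class ServerNode {
  constructor(name) {
    this.name = name;
    this.localUsers = new Map(); // userId -> messages delivered to that user
    bus.on('message', ({ userId, payload }) => this.deliverLocally(userId, payload));
  }
  connect(userId) {
    this.localUsers.set(userId, []);
  }
  deliverLocally(userId, payload) {
    const inbox = this.localUsers.get(userId);
    if (inbox) inbox.push(payload); // ignore users connected elsewhere
  }
  emitToUser(userId, payload) {
    // Publish instead of delivering directly: the sender does not need to
    // know which server the user is connected to.
    bus.emit('message', { userId, payload });
  }
}

const a = new ServerNode('a');
const b = new ServerNode('b');
b.connect('alice');             // alice is connected to server b
a.emitToUser('alice', 'hello'); // but server a wants to reach her
const aliceInbox = b.localUsers.get('alice');
console.log(aliceInbox); // [ 'hello' ]
```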

Load Balancing with Node and Heroku

I have a web app that accepts API requests from an iOS app. My web app is hosted on Heroku using their free dyno, which is able to process 512 MB of data per request. Because Node is a single-threaded application, this will be a problem once we start getting higher levels of traffic from the iOS end to the web server. I'm also not the richest person in the world, so I'm wondering if it would be smart to create another free Heroku app and use a round-robin approach to balance the load received from the iOS app?
I just need to be pointed in the right direction. Vertical scaling is not really an option financially.

I'm the Node.js platform owner at Heroku.
You may be doing some premature optimization. Node.js, on our smallest 1X size (512MB RAM), can handle hundreds of simultaneous connections and thousands of requests per minute.
If your iOS app is consistently maxing that out, it may be time to consider monetization!

As mentioned by Daniel it's against Heroku rules. Having said that there are probably other services that would allow you to do that.
One way to approach this problem is to use cluster module with ZeroMQ (you need to have ZeroMQ installed before using the module - see module description).
var cluster = require('cluster');
var zmq = require('zmq');
var ROUTER_SOCKET = 'tcp://127.0.0.1:5555';
var DEALER_SOCKET = 'tcp://127.0.0.1:7777';
if (cluster.isMaster) {
  // this is the main process - create Router and Dealer sockets
  var router = zmq.socket('router').bind(ROUTER_SOCKET);
  var dealer = zmq.socket('dealer').bind(DEALER_SOCKET);
  // forward messages between router and dealer
  router.on('message', function() {
    var frames = Array.prototype.slice.call(arguments);
    dealer.send(frames);
  });
  dealer.on('message', function() {
    var frames = Array.prototype.slice.call(arguments);
    router.send(frames);
  });
  // listen for worker processes to come online
  cluster.on('online', function() {
    // do something with a new worker, maybe keep an array of workers
  });
  // fork worker processes
  for (var i = 0; i < 100; i++) {
    cluster.fork();
  }
} else {
  // worker process - connect to Dealer
  let responder = zmq.socket('rep').connect(DEALER_SOCKET);
  responder.on('message', function(data) {
    // do something with incoming data
  });
}
This is just to point you in the right direction. If you think about it you can create a script with a parameter that will tell it if it's a master or a worker process. Then on the main server run it as is, and on additional servers run it using worker flag which will force it to connect to the main dealer.
Now your main app needs to send the requests to the router, which will be later forwarded to the worker processes:
var zmq = require('zmq');
var requester = zmq.socket('req');
var ROUTER_SOCKET = 'tcp://127.0.0.1:5555';
// handle replies - for example completion status from the worker processes
requester.on('message', function(data) {
  // do something with the reply
});
requester.connect(ROUTER_SOCKET);
// send requests to the router, serialized as a string
requester.send(JSON.stringify({
  // some object describing the task
}));

So first off, as the other replies have pointed out, running two copies of your app to avoid Heroku's limits violates their ToS, which may not be a great idea.
There is, however, some good news. For starters (from Heroku's docs):
The dyno manager will restart your dyno and log an R15 error if the memory usage of a:
free, hobby or standard-1x dyno reaches 2.5GB, five times its quota.
As I understand it, despite the fact that your dyno has 512MB of actual RAM, it'll swap out to 5x that before it is actually restarted. So you can go beyond 512MB (as long as you're willing to pay the performance penalty for swapping to disk, which can be severe).
Further to that, Heroku bills by the second and allows you to scale your dyno formation up and down as needed. This is fairly easy to do within your own app by hitting the Heroku API – I see that you've tagged this with NodeJS so you might want to check out:
Heroku's node client
the very-barebones-but-still-functional toots/node-heroku module
Both of these modules allow you to scale up and down your formation of dynos — with a simple heuristic (say, always have a spare 1X dyno running), you could add capacity while you're processing a request, and get rid of the spare capacity when api requests aren't running. Given that you're billed by the second, this can end up being very inexpensive; 1X dynos work out to something like 5¢ an hour to run. If you end up running extra dynos for even a few hours a day, it's a very, very small cost to you.
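A sketch of the "always keep a spare dyno" heuristic and the rough cost arithmetic. The function names, the default limits, and the 3 hours/day example are hypothetical; the 5 cents per hour figure is the approximate one quoted above. The result would then be passed to whichever Heroku client you use, which is not shown here.

```javascript
// Hypothetical heuristic: keep `spare` idle dynos on top of the busy ones,
// clamped to the range [min, max].
function desiredDynos(busyDynos, { spare = 1, min = 1, max = 5 } = {}) {
  return Math.min(max, Math.max(min, busyDynos + spare));
}

// Rough extra cost in cents, using the ~5 cents/hour figure quoted above.
function extraCostCents(extraDynos, hoursPerDay, days, centsPerHour = 5) {
  return extraDynos * hoursPerDay * days * centsPerHour;
}

console.log(desiredDynos(0));          // 1
console.log(desiredDynos(2));          // 3
console.log(desiredDynos(10));         // 5 (capped at max)
console.log(extraCostCents(1, 3, 30)); // 450 cents for one extra dyno, 3 h/day, 30 days
```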
Finally: there are also 3rd party services such as Adept and Hirefire (two random examples from Google, I'm sure there are more) that allow you to automate this to some degree, but I don't have any experience with them.

You certainly could, I mean, programmatically - but that would violate Heroku's ToS:
4.4 You may not develop multiple Applications to simulate or act as a single Application or otherwise access the Heroku Services in a manner intended to avoid incurring fees.
Now, I'm not sure about this:
Because node is a single threaded application this will be a problem once we start getting higher levels of traffic from the ios end to the web server.
There are some threads discussing that, with some interesting answers:
Clustering Node JS in Heavy Traffic Production Environment
How to decide when to use Node.js?
Also, they link to this video, introducing Node.js, which talks a bit about benchmarks:
Introduction of Node JS by Ryan Dahl

How to scale socket.io without redis

I'm currently searching for an alternative to scale my express app with socket.io. The problem is that I don't want to use redis as socket.io store. Are there any other possibilities to cluster socket.io except with Clusterhub?
EDIT: I tried to use fakeredis as replacement for redis, but it seems like it doesn't work with socket.io. From ActionHero.js I know that faye-websocket works with fakeredis.

This might well depend on your socket.io usage and the type of scaling you want to achieve (cluster vs. scaling to multiple machines).
So, here is what I did to scale our usage of socket.io to multiple servers.
We have 3 servers behind a load balancer. When a socket connects, it connects to any of the 3 servers. Each server keeps an in-memory list of its sockets, and all three servers have an ordered list of internal server addresses, e.g. [server1, server2, server3].
What I do basically is a ring (internally we call it the "ring of sockets"):
If I need to emit an event to a socket from server1, I first check whether the socket is connected to server1. If not, I send an HTTP request to the next server (server2), which checks whether the socket is there; if it is not, server2 sends the same request to server3, and so on until the request reaches the origin, in which case you might throw an error.
It's almost the same if I need to broadcast a message: I start from one server and then call an HTTP endpoint on the others.
The algorithm I use to determine the next node (next_node.js) is:
var nodes = process.env.NODES.split(',');
// this is usually: http://server1/,http://server2/,http://server3/
var url = require('url');
var current = require("os").hostname();
// origin is the node that started the lookup
exports.get = function (origin) {
  var next_node_i = nodes.map(function (uri) {
    return url.parse(uri).hostname;
  }).reduce(function (prev, curr, i, arr){
    return curr === current && i < arr.length - 1 ? i + 1 : prev;
  }, 0);
  var next_node = nodes[next_node_i];
  if (origin && url.parse(next_node).hostname === origin) {
    // if the next node is equal to the first node initiating the lookup
    // it means the socket we are looking for is not connected to any node.
    return null;
  }
  return next_node;
};
Caveats:
Latency between these servers is low and network partitioning is unlikely, since they are physically in the same datacenter. Even if a partition did happen, it would not matter much for us.
We always run the ring in the same direction. An improved version would be to run in both directions(?)
Servers share a secret to call these endpoints.
In my opinion this is a very easy way to achieve scaling in a lot of socket.io use cases. There might be a lot of other scenarios where this is not an option, but I hope this gives some ideas.
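The ring lookup described in this answer can be simulated in memory like this. It is only a sketch: the `ring` array and `findSocket` function are hypothetical stand-ins, and in the real setup each hop would be an authenticated HTTP request to the next server rather than a loop iteration.

```javascript
// In-memory stand-in for the three servers and the sockets each one holds.
const ring = [
  { name: 'server1', sockets: new Set(['s1']) },
  { name: 'server2', sockets: new Set(['s2']) },
  { name: 'server3', sockets: new Set(['s3']) },
];

// Walk the ring in one direction starting at `originIndex`. Return the name
// of the server holding the socket, or null once we are back at the origin.
function findSocket(ring, originIndex, socketId) {
  let i = originIndex;
  do {
    if (ring[i].sockets.has(socketId)) return ring[i].name;
    i = (i + 1) % ring.length;
  } while (i !== originIndex);
  return null;
}

console.log(findSocket(ring, 0, 's3'));      // server3 (two hops from server1)
console.log(findSocket(ring, 1, 's1'));      // server1 (wraps around the ring)
console.log(findSocket(ring, 0, 'missing')); // null (back at the origin)
```

In the worst case a lookup touches every server once, which is fine for three nodes but grows linearly with the size of the ring.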

If you're comfortable with Azure services, some of the guys on the Azure team have taken the liberty of writing a Service Bus store for socket.io.
Glenn Block Explains Socket.IO Scale-Out on Service Bus

Possible to simulate several concurrent connections to test a nodejs app

I have a simple node.js/socket.io (websockets) application running on localhost. I am trying to see how many concurrent connections it can handle. Is it possible to simulate several concurrent users on localhost itself?
This is my half baked attempt using socket.io-client:
function connectAndSend(){
socket.emit('qand',{
code :'ubuntu'
});
}
socket.on('connect', function () {
});
socket.on('q', function (data) {
console.log(data);
});
function callConnect(){
console.log('calling');
connectAndSend() ;
setTimeout(callConnect,100) ;
}
callConnect() ;
As I see it this only 'emits' a new message every 100 ms and is not simulating concurrent connections.

In your call to connect, you must tell socket.io to create a new connection for each call to connect. For example:
var socket = io.connect(server, { "force new connection": true });
Also, if you want to raise the outbound TCP connection limit (which seems to default to 5 connections per target), do something like
require('http').globalAgent.maxSockets = 1000;
before connecting.
But note that creating and closing tcp sockets at a fast rate will make TCP connections pile up in state TIME_WAIT and depending on your OS and your network settings you'll hit a limit pretty soon, meaning you'll have to wait for those old sockets to timeout before you can establish new connections.
If I recall correctly, the limit was around 16k connections (per target ip/port combo) on Windows (both Server 2008 R2 and Windows 7), and the default TIME_WAIT timeout in Windows is 4 minutes, so if you create more than 16k connections in 4 minutes on Windows, you'll probably hit that wall.
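To put those numbers in perspective, here is a back-of-the-envelope sketch. It takes the recalled figures above at face value (roughly 16k connections per target and a 4 minute TIME_WAIT), and the `maxSustainedRate` helper is made up for illustration.

```javascript
// If closed connections linger for `timeWaitSeconds`, then opening and
// closing more than limit / timeWaitSeconds connections per second will
// eventually exhaust the available slots.
function maxSustainedRate(connectionLimit, timeWaitSeconds) {
  return connectionLimit / timeWaitSeconds;
}

const perSecond = maxSustainedRate(16000, 4 * 60);
console.log(perSecond.toFixed(1)); // 66.7
```

So under those assumptions, a test client that churns through more than about 67 short-lived connections per second against one target would be expected to hit the wall within a few minutes.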

Check here:
Long connections with Node.js, how to reduce memory usage and prevent memory leak? Also related with V8 and webkit-devtools
and specifically, the test procedure used by the author of the question mentioned above
EDIT:
You can use following tools to check how many requests per second your server is capable of serving
ab - http://httpd.apache.org/docs/2.2/programs/ab.html
siege - http://www.joedog.org/siege-home/
