Node socket.io on load balanced Amazon EC2 - node.js

I have a standard LAMP EC2 instance set-up running on Amazon's AWS. Having also installed Node.js, socket.io and Express to meet the demands of live updating, I am now at the stage of load balancing the application. That's all working, but my sockets aren't. This is how my set-up looks:-
--- EC2 >> Node.js + socket.io
/
Client >> ELB --
\
--- EC2 >> Node.js + socket.io
[RDS MySQL - EC2 instances communicate to this]
As you can see, each instance has an installation of Node and socket.io. However, occasionally Chrome debug will 400 the socket request returning the reason {"code":1,"message":"Session ID unknown"}, and I guess this is because it's communicating to the other instance.
Additionally, let's say I am on page A and the socket needs to emit to page B - because of the load balancer these two pages might well be on a different instance (they will both be open at the same time). Using something like Sticky Sessions, to my knowledge, wouldn't work in that scenario because both pages would be restricted to their respective instances.
How can I get around this issue? Will I need a whole dedicated instance just for Node? That seems somewhat overkill...

The issues come up when you consider both websocket traffic (layer 4 -ish) and HTTP traffic (layer 7) moving across a load balancer that can only inspect one layer at a time. For example, if you set the ELB to load balance on layer 7 (HTTP/HTTPS) then websockets will not work at all across the ELB. However, if you set the ELB to load balance on layer 4 (TCP) then any fallback HTTP polling requests could end up at any of the upstream servers.
You have two options here. You can figure out a way to effectively load balance both HTTP and websocket requests or find a way to deterministically map requests to upstream servers regardless of the protocol.
The first one is pretty involved and requires another load balancer. A good walkthrough can be found here. It's worth noting that when that post was written HAProxy didn't have native SSL support. Now that this is the case it might be possible to just remove the ELB entirely, if that's the route you want to go. If that's the case the second option might be better.
Otherwise you can use HAProxy on its own (or a paid version of Nginx) to implement a deterministic load balancing mechanism. In this case you would use IP hashing since socket.io does not provide a route-based mechanism to identify a particular server like sockjs. This would use the first 3 octets of the IP address to determine which upstream server gets each request so unless the user changes IP addresses between HTTP polls then this should work.

The solution would be for the two(or more) node.js installs to use a common session source.
Here is a previous question on using REDIS as a common session store for node.js How to share session between NodeJs and PHP using Redis?
and another
Node.js Express sessions using connect-redis with Unix Domain Sockets

Related

Horizontal scaling with a node.js app & socket io

My team and I are working on a digital signage platform.
We have ~ 2000 Raspberry Pi around the world connected to a Nodejs server using Socket IO. The Raspberries are initiating the connection.
We would like to be able to scale horizontally our application on multiple servers but we have a problem that we can’t figure out.
Basically, the application stores the sockets of the connected Raspberry in an array.
We have an external program that calls the API within the server, this results by the server searching which sockets will be "impacted" by the API call and send them the informations.
After lots of search, we assume that we have to stores the sockets (or their ID) elsewhere (Redis ?), to make the application stateless. Then, any server can respond to a API call and look the sockets in a central place.
Unfortunately, we can’t find any detailed example on how to do that.
Can you please help us ?
Thanks
(You can't store sockets from multiple server instances in a shared datastore like redis: they only make sense in the context of the server where they were initiated).
You will need a cluster of node.js servers to handle this. There are various ways to make a cluster. They all involve directing incoming connections from your RPis to a "generic" hostname, for example server.example.com. Behind that server.example.com hostname will be multiple node.js servers.
Each incoming connection from each RPi connects to just one of those multiple servers. (You know this, I believe.) This means one node.js server in your cluster "owns" each individual RPi.
(Telling you how to rig up a cluster of node.js servers is beyond the scope of this answer. Hints: round-robin DNS or a reverse-proxy nginx front end.)
Then, you want to route -- to fan out -- the incoming data from each API call to each server in the cluster, so the server can route it to the RPis it owns.
Here's a good way to handle that:
Set up a redis cache or other shared data store. It can be very small.
When each node.js server starts, have it register itself as active. That is, have it place its own specific address for handling API calls into the shared server. The specific address is probably of the form 12.34.56.78:3000: that is, an IP address and port.
Have each server update that address every so often, once a minute or so, to show it is still alive.
When an API call arrives at server.example.com, it will come to a more-or-less randomly chosen node.js server instance.
Get that server to read the list of server addresses from the redis cache
Get that server to repeat the API call to all servers except itself. Add a parameter like repeated=yes to the repeated API calls.
Then, each server looks at its list of connected sockets and does what your application requires.
On server shutdown, have the server unregister itself -- remove its address from redis -- if possible.
In other words, build a way of fanning out the API calls to all active node.js servers in your cluster.
If this must scale up to a very large number (more than a hundred or so) node.js servers, or to many hundreds of API calls a minute, you probably should investigate using message queuing software.
SECURE YOUR REDIS server from random cybercreeps on the internet.

Load balancing sockets on a horizontally scaling WebSocket server?

Every few months when thinking through a personal project that involves sockets I find myself having the question of "How would you properly load balance sockets on a dynamic horizontally scaling WebSocket server?"
I understand the theory behind horizontally scaling the WebSockets and using pub/sub models to get data to the right server that holds the socket connection for a specific user. I think I understand ways to effectively identify the server with the fewest current socket connections that I would want to route a new socket connection too. What I don't understand is how to effectively route new socket connections to the server you've picked with low socket count.
I don't imagine this answer would be tied to a specific server implementation, but rather could be applied to most servers. I could easily see myself implementing this with vert.x, node.js, or even perfect.
First off, you need to define the bounds of the problem you're asking about. If you're truly talking about dynamic horizontal scaling where you spin up and down servers based on total load, then that's an even more involved problem than just figuring out where to route the latest incoming new socket connection.
To solve that problem, you have to have a way of "moving" a socket from one host to another so you can clear connections from a host that you want to spin down (I'm assuming here that true dynamic scaling goes both up and down). The usual way I've seen that done is by engaging a cooperating client where you tell the client to reconnect and when it reconnects it is load balanced onto a different server so you can clear off the one you wanted to spin down. If your client has auto-reconnect logic already (like socket.io does), you can just have the server close the connection and the client will automatically re-connect.
As for load balancing the incoming client connections, you have to decide what load metric you want to use. Ultimately, you need a score for each server process that tells you how "busy" you think it is so you can put new connections on the least busy server. A rudimentary score would just be number of current connections. If you have large numbers of connections per server process (tens of thousands) and there's no particular reason in your app that some might be lots more busy than others, then the law of large numbers probably averages out the load so you could get away with just how many connections each server has. If the use of connections is not that fair or even, then you may have to also factor in some sort of time moving average of the CPU load along with the total number of connections.
If you're going to load balance across multiple physical servers, then you will need a load balancer or proxy service that everyone connects to initially and that proxy can look at the metrics for all currently running servers in the pool and assign the connection to the one with the most lowest current score. That can either be done with a proxy scheme or (more scalable) via a redirect so the proxy gets out of the way after the initial assignment.
You could then also have a process that regularly examines your load score (however you decided to calculate it) on all the servers in the cluster and decides when to spin a new server up or when to spin one down or when things are too far out of balance on a given server and that server needs to be told to kick several connections off, forcing them to rebalance.
What I don't understand is how to effectively route new socket connections to the server you've picked with low socket count.
As described above, you either use a proxy scheme or a redirect scheme. At a slightly higher cost at connection time, I favor the redirect scheme because it's more scalable when running and creates fewer points of failure for an existing connection. All clients connect to your incoming connection gateway server which is responsible for knowing the current load score for each of the servers in the farm and based on that, it assigns an incoming connection to the host with the lowest score and this new connection is then redirected to reconnect to one of the specific servers in your farm.
I have also seen load balancing done purely by a custom DNS implementation. Client requests IP address for farm.somedomain.com and that custom DNS server gives them the IP address of the host it wants them assigned to. Each client that looks up the IP address for farm.somedomain.com may get a different IP address. You spin hosts up or down by adding or removing them from the custom DNS server and it is that custom DNS server that has to contain the logic for knowing the load balancing logic and the current load scores of all the running hosts.
Route the websocket requests to a load balancer that makes the decision about where to send the connections.
As an example, HAProxy has a leastconn method for long connections that picks the least recently used server with the lowest connection count.
The HAProxy backend server weightings can also be modified by external inputs, #jfriend00 detailed the technicalities of weighting in their answer.
I found this project that might be useful:
https://github.com/apundir/wsbalancer
A snippet from the description:
Websocket balancer is a stateful reverse proxy for websockets. It distributes incoming websockets across multiple available backends. In addition to load balancing, the balancer also takes care of transparently switching from one backend to another in case of mid session abnormal failure.
During this failover, the remote client connection is retained as-is thus remote client do not even see this failover. Every attempt is made to ensure none of the message is dropped during this failover.
Regarding your question : that new connection will be routed by the load balancer if configured to do so.
As #Matt mentioned, for example with HAProxy using the leastconn option.

Does the ws websocket server library requires sticky session when it is used behind a load balancer?

I am trying to setup websocket servers behind a load balancer. At first, I used the socket.io library. But I found that it requires sticky session when used behind a load balancer.
According to this website, it sends multiple requests to perform handshake and establish a connection. If the requests are sent to different servers, the connection will fail.
After further study, i found that other websocket server library like SockJS also have the same problem. They all require sticky session to work behind a load balancer.
Now I am checking the websocket library ws. But I could not find any example of using it behind load balancer.
Does the ws library requires sticky session to work?
Is there any other websocket library that can work without sticky session behind a load balancer?
Is there a specific reason why you can't / don't want to rely on sticky sessions?
If you want to distribute socket connections across multiple hosts you are going to need some solution, and sticky sessions is a perfectly good one.
The socket.io page on using multiple nodes you link to even describes a way to implement the solution, "by routing clients based on their originating address" via NginX. Have you tried this and found that it doesn't work?
There is also a very good article on Horizontally Scaling Node.js and WebSockets with Redis which describes solving the exact issue you have with sticky sessions and automatic failover.

Bouncy or nginx for load balancing websockets?

I want to add a load balancer infront of my nodejs websockets server. The plan is to add another node on another physical machine and have a load balancer in front. The load balancer will also be on its own physical machine.
The requirement is that several 1000s of simultaneous connections could be handled and I'm a bit worried about bouncys upper limitations.
I like the consistency of using bouncy since it is a node module, but at the same time it seems like nginx could handle more socket connections or be a bit more stable.
Anyone who has experience with bouncy or nginx as load balancer and could give me some advices?
Thanks!
nginx is pretty good for mass connections, check these answer.
https://stackoverflow.com/a/16289251/2325522
there you can see how to use Nginx as load balacer.
The only problem that you can have is the mass band-width needed to serve 1000's of simultaneous connections.
Example:
5000 clients * 0.25Mb/request (a little one)
=
1250mb (1.25Gb outgoing band-width)
Hope these solve your doubts.

Node.js Reverse Proxy/Load Balancer

I am checking node-http-proxy and nodejs-proxy to build a DIY reverse proxy/load balancer in Node.js. After coding a small version, I setup 2 WEBrick servers for the same Rails app so I could load balance (round robin) between them. However each HTTP request is sent to one or another server which is very inefficient since the loading process of CSS and Javascript files from the home page is performed with more than 25 GET requests.
I tried to play a bit with socket events but I didn't get anywhere because by default it uses keep-alive connections (possibly this is why nginx just support http/1.0).
Ok, so I am wondering how can my proxy send a block of HTTP requests (for instance loading a webpage entirely, etc) to only one server so I could send the next block to another server.
You need to consider stickiness or session persistence. This will ensure future connections after the first connection inbound will get 'stuck' to the chosen server for the duration of the session or until the persistence connection times out.

Resources