Using Node.js + Socket.io on Heroku with multiple dynos

I am trying to build a realtime multiplayer game with Node / SocketIO on Heroku and am not sure how to handle multiple dynos with regards to sharing SocketIO connection data.
For example:
I have 2 Heroku dynos, each running Node + SocketIO
Player A hosts a game, and dyno 1 handles that connection
Player B attempts to join the same game, but due to the Heroku router, dyno 2 ends up handling that connection.
Actions in the game need to happen in real time, so when Player A performs an action Player B needs to immediately see the results of that action.
In a single-dyno environment, this would be relatively simple. When Player A performs an action, it simply gets emitted to player B. How would this work when there are multiple dynos?

Since you aren't able to select (or know) which Heroku dyno you're connecting to (web.1, web.2, or web.n), you will need to find another way to communicate changes in the game across many dynos.
One way to do this would be to add a distributed messaging service for communicating changes in the game. Using a distributed publish-subscribe architecture, Player A can send actions to dyno 1, which then puts them on a queue.
Dyno 2 can then subscribe to the game's queue when Player B joins the game. Now, Dyno 2 will receive actions from the game and then be able to send them over the socket to Player B.
While this approach won't have latency as low as a single high-performance server, something like it may be necessary if you want the benefits of scaling and redundancy.
On Heroku, you might consider using Redis for this purpose, since Heroku provides a hosted version and it performs well for low-latency applications.
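To make that concrete, here is a minimal sketch of the pub/sub approach using the node `redis` client alongside Socket.IO. The event and channel names are made up for illustration, and `io` is assumed to be your Socket.IO server; every dyno runs this same code:

```js
// Minimal sketch (hypothetical event names): Redis pub/sub across dynos.
const redis = require('redis');
const pub = redis.createClient(process.env.REDIS_URL);
const sub = redis.createClient(process.env.REDIS_URL); // a subscribed client can't issue other commands

io.on('connection', (socket) => {
  socket.on('join-game', (gameId) => socket.join(gameId));

  socket.on('action', (action) => {
    // Don't emit locally; publish instead, since the other player's
    // socket may be held by a different dyno.
    pub.publish('game:' + action.gameId, JSON.stringify(action));
  });
});

// Every dyno subscribes; whichever dyno holds Player B's socket
// delivers the action over its local websocket.
sub.psubscribe('game:*');
sub.on('pmessage', (pattern, channel, message) => {
  const gameId = channel.split(':')[1];
  io.to(gameId).emit('action', JSON.parse(message));
});
```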

Related

Working with WebSockets and NodeJs clusters

I currently have a Node server running that works with MongoDB. It handles some HTTP requests, but it largely uses WebSockets. Basically, the server connects multiple users to rooms with WebSockets.
My server currently has around 12k WebSockets open, and it's almost crippling my single-threaded server; now I'm not sure how to convert it over.
The server holds HashMap variables for the connected users and rooms, and when a user performs an action the server often references those HashMaps. So I'm not sure how to use clusters here. I thought about creating a thread for every WebSocket message, but I'm not sure if that's the right approach, and it would not be able to access the HashMaps for the other users.
Does anyone have any ideas on what to do?
Thank you.
You can look at the socket.io-redis adapter for architectural ideas or you can just decide to use socket.io and the Redis adapter.
It moves the equivalent of your hashmap into a separate process (the Redis in-memory database) so that all clustered processes can access it.
The socket.io-redis adapter also supports higher-level functions so that you can emit to every socket in a room with one call and the adapter finds where everyone in the room is connected, contacts that specific cluster server, and has it send the message to them.
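A rough sketch of the classic socket.io-redis setup (pre-v7 adapter API; host and port are placeholders):

```js
// Sketch: attach the Redis adapter so room emits span all processes.
const io = require('socket.io')(3000);
const redisAdapter = require('socket.io-redis');
io.adapter(redisAdapter({ host: 'localhost', port: 6379 }));

// One call now reaches every socket in the room, on every process:
io.to('room42').emit('event', { text: 'hello room' });
```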
I thought about creating a thread for every WebSocket message, but I'm not sure if that's the right approach, and it would not be able to access the HashMaps for the other users.
Threads in node.js (worker threads) are not lightweight (each has its own V8 instance), so you will not want one per WebSocket connection. You could group a certain number of WebSocket connections onto each worker, but at that point it is likely easier to use clustering, because node.js will distribute incoming connections across cluster workers for you automatically, whereas you would have to do that yourself with your own worker pool.
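For reference, a minimal sketch of node's built-in cluster module; note that with WebSockets you would typically also need sticky sessions so a client's handshake and upgrade land on the same worker:

```js
// Minimal cluster sketch: one worker per CPU core, all sharing a port.
const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isMaster) {
  os.cpus().forEach(() => cluster.fork());
  cluster.on('exit', (worker) => {
    console.log('worker ' + worker.process.pid + ' died; restarting');
    cluster.fork();
  });
} else {
  // Each worker is a full server; shared state (your hashmaps) must
  // live in Redis or similar so every worker sees the same data.
  http.createServer((req, res) => res.end('ok')).listen(3000);
}
```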

Real-time communication between servers and clients

I have a socket game server that runs everything in one single process; the problem is when I want to scale out my app.
Since it is a card game, when there is an event at a table I can easily reach all the players in the same room, because I have direct access to their socket connections.
If I want another server (or many, depending on the load), it is a completely different process, and I need to be able to have, for instance, one room where players from server 1 can play against players from server 2; and if server 1 fails, server 2 can take over its connections and keep the players in the game without interruption.
What would be the architecture for this?
Some hosting providers support both WebSockets and horizontal scaling. This allows your users to establish a WebSocket connection with a node. However, you may need an event from that user to be broadcast to users connected to other nodes.
You may want to consider something like RabbitMQ. By using a fanout or topic exchange you can broadcast the event to a set of listeners. The listeners will be the various nodes in your cluster that are maintaining the websocket connections.
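A sketch of that fanout pattern with the `amqplib` client; the exchange and event names are illustrative, and error handling is omitted:

```js
// Sketch: fanout exchange so every node gets a copy of each table event.
const amqp = require('amqplib');

async function setup(io) {
  const conn = await amqp.connect(process.env.AMQP_URL);
  const ch = await conn.createChannel();
  await ch.assertExchange('table-events', 'fanout', { durable: false });

  // Each node binds its own exclusive queue, so every node is a listener.
  const { queue } = await ch.assertQueue('', { exclusive: true });
  await ch.bindQueue(queue, 'table-events', '');
  ch.consume(queue, (msg) => {
    const event = JSON.parse(msg.content.toString());
    io.to(event.tableId).emit('table-event', event); // deliver to local sockets
    ch.ack(msg);
  });

  // Publishing side: any node can broadcast an event to all nodes.
  return (event) =>
    ch.publish('table-events', '', Buffer.from(JSON.stringify(event)));
}
```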

Using Backbone.iobind (socket.io) with a cluster of node.js servers

I'm using Backbone.iobind to bind my client Backbone models over socket.io to the back-end server, which in turn stores it all in MongoDB.
I'm using socket.io so I can synchronize changes back to other clients' Backbone models.
The problems start when I try to run the same thing over a cluster of node.js servers.
Setting a session store was easy using connect-mongo which stores the session to MongoDB.
But now I can't inform all the clients on every change, since the clients are distributed between the different node.js servers.
The only solution I found is to set up a pub/sub queue between the different node.js servers (e.g. mubsub), which seems like a very heavyweight solution that will trigger an event on all the servers for every change.
How did you reach the conclusion that pub/sub is a "very heavyweight solution"?
Sounds like you got it right up until that part :-)
Oh, and pub/sub is not a queue.
Let's examine that claim:
The nice thing about pub/sub is that you publish and subscribe to channels/topics.
So, using the classic chat server example, let's say you have a million users connected in total, but #myroom only has 50 users in it.
When a message is sent to #myroom, it's being published once. No duplication whatsoever.
In most use cases you won't even need to store it on disk/RAM, so we're mostly looking at network bandwidth here. And you're probably already sending more data over the wire to MongoDB, so I assume that's not your bottleneck.
If you also use socket.io's rooms feature (which is basically its own pub/sub mechanism), that means only those 50 users will have the message emitted to them over the websocket.
And no, socket.io won't iterate over 1M clients to find out which of them are in room #myroom ;-)
So the message is published once, each subscriber (node.js instance) is notified once, and only the relevant clients receive it; socket.io won't waste CPU cycles finding them, because it keeps track of who is in a room on join() and leave().
Doesn't that sound pretty efficient and light-weight?
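A quick sketch of the rooms mechanics described above (event names are just for illustration):

```js
// Rooms sketch: socket.io tracks membership on join()/leave(), so an
// emit to a room touches only that room's members.
io.on('connection', (socket) => {
  socket.on('join', (room) => socket.join(room));
  socket.on('chat', (room, text) => {
    // Published once; only sockets in `room` receive it.
    io.to(room).emit('chat', text);
  });
});
```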
Give Redis a shot.
It's really simple to set up, runs entirely in memory, is blazing fast, replication is extremely simple, etc.
That's the way socket.io recommends passing events between nodes.
You can find more information/code here.
Additionally, if MongoDB can't handle the load at any point, you can use Redis as your session-store as well.
Hope this helps!

node.js server with socket.io handling 50000 simultaneous clients

We are developing a JavaScript control which should be constantly connected to a server to receive animation updates.
We are planning to host this stuff on an Amazon cloud.
The scenario is like this: the server connects to an ActiveMQ queue waiting for updates and, for each update, broadcasts it to all connected clients.
Is it even possible to handle such load with node.js + socket.io?
Will a single node.js server be able to handle such load?
How to organize fast transport between different nodes if we will have to use more than one node?
Will a single node.js server be able to handle such load? ... How to organize fast transport between different nodes if we will have to use more than one node?
You say that you are planning to host on Amazon, so first off, nothing should be scoped for a single server. Amazon machines can simply "disappear"; you have to assume that you will be using multiple machines.
...handling 50k simultaneous clients
So to start with, 50k connections for a single box is a very big number. Here's a very detailed blog post discussing "getting to 10k" with node.js+socket.io.
Here's a very telling quote:
it seemed as though 10,000 clients simply required more serialization than my server was able to handle.
So a key component to "getting to 50k" is going to be the amount of work required just pushing data over the wire.
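One practical consequence: serialize each update once and reuse the bytes for every client, instead of serializing per client. A rough sketch using the plain `ws` library, where the `queue` emitter is a stand-in for your ActiveMQ consumer:

```js
// Sketch: one serialization pass per update, not one per client.
// `queue` is a hypothetical stand-in for your ActiveMQ consumer.
const WebSocket = require('ws');
const wss = new WebSocket.Server({ port: 8080 });

queue.on('update', (update) => {
  const payload = JSON.stringify(update); // serialize once
  wss.clients.forEach((client) => {
    if (client.readyState === WebSocket.OPEN) {
      client.send(payload); // the pre-built string goes out as-is
    }
  });
});
```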
How to organize fast transport between different nodes if we will have to use more than one node.
That blog post is the first of 3. When you're done with the first, read the other two. That should point you in the right direction.

Scalable push application with node.js

I'm thinking about writing a few web applications with almost the same requirements as a chat app. And I would like them to be able to scale easily.
I have worked a bit with node.js and I understand how it can help design push applications, but I have some difficulties when thinking about running them on multiple servers.
Here are some designs I can think of for a large-scale chat app:
1 - Servers have state; they keep the connections open and can push new messages to clients. In this scenario we are limited by the physical memory of one server, so we cannot scale linearly if we have too many users per room.
2 - Servers have no state; they query a distributed database to respond to client requests. In this scenario clients poll the servers. We could scale linearly, but throughput decreases, messages are not delivered instantly, and polling is widely considered bad practice for scaling.
3 - A mix of 1 and 2. Servers keep their clients' connections open and poll the distributed database. The application is more complex to write and we still use polling, though similar client requests (clients in the same room) are grouped into a single request made by the server. The code becomes unnecessarily complicated, and it does not scale when we have many rooms and few users per room.
4 - Servers have no state, and the database cluster uses events to notify every registered server about new messages. This is the solution I would like to have, but I haven't heard of any database with this feature. (Some people are discussing it for MongoDB here: https://jira.mongodb.org/browse/SERVER-124)
So why is the 4th solution not used much today?
How do people usually design their applications in this case?
Since you want a push application, you would probably use Socket.IO with RedisStore.
By using this combination, the data for all the connections is kept in Redis (an in-memory database), so you can scale beyond a single process. Redis is also used here for pub/sub.
The idea is to trigger an event when something needs to be pushed, then send a message to the browser using Socket.IO. If you want to listen to database changes, it may be better to use CouchDB with its _changes feed.
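For reference, the old Socket.IO 0.9-era RedisStore configuration looked roughly like this (option names per the old Socket.IO wiki; treat this as a sketch, not a drop-in):

```js
// Sketch of the Socket.IO 0.9-era RedisStore configuration.
var sio = require('socket.io');
var RedisStore = require('socket.io/lib/stores/redis');
var redis = require('redis');

var io = sio.listen(8080);
io.set('store', new RedisStore({
  redisPub: redis.createClient(),    // publishes events to other nodes
  redisSub: redis.createClient(),    // receives events from other nodes
  redisClient: redis.createClient()  // general-purpose storage
}));
```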
Resources:
https://github.com/dshaw/talks/tree/master/2011-10-jsclub/sample-app
http://www.ranu.com.ar/2011/11/redisstore-and-rooms-with-socketio.html
How to reuse redis connection in socket.io?
Instead of triggers for case 4, you might want to hook into MongoDB replication.
Let's assume you have a replica set (you wouldn't run a single mongod, would you?).
Every change is written to the oplog on the primary and then replicated to the secondaries.
You can efficiently pull new updates from the oplog using tailable cursors. Note that this is still pull, not push.
Your node.js process can then push these events to the clients.
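A sketch of that oplog-tailing idea with the official `mongodb` driver; the database, collection, and field names are examples only:

```js
// Sketch: tail the replica-set oplog with a tailable, awaitData cursor.
const { MongoClient } = require('mongodb');

async function tailOplog(io) {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  const oplog = client.db('local').collection('oplog.rs');

  const cursor = oplog.find(
    { ns: 'chat.messages' },            // only ops on our collection
    { tailable: true, awaitData: true } // block waiting for new entries
  );

  while (await cursor.hasNext()) {      // still pull, as noted above
    const op = await cursor.next();
    if (op.op === 'i') {                // 'i' = insert
      io.to(op.o.room).emit('message', op.o); // push to connected clients
    }
  }
}
```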
