How to scale a NodeJS stateful application - node.js

I am currently working on a web-based MMORPG game and would like to setup an auto-scaling strategy based on Docker and DigitalOcean droplets.
However, I am wondering how I could manage to do so:
My game server would have to be splittable across different Docker containers BUT every game server instance should act as if it was only one gigantic game server. That means that every modification happening in one (character moving) should also be mirrored in every other game server.
I am trying to get this to work (at least conceptually) but can't find a way to synchronize all my instances properly. Should I use a master only broadcasting events or is there an alternative?
I was wondering the same thing about my MySQL database: since every game server would have to read/write from/to the db, how would I make it scale properly as the game gets bigger and bigger? The best solution I could think of was to keep the database on a single server which would be very powerful.
I understand that this could be easy if all game servers didn't have to "share" their state but this is primarily thought so that I can scale quickly in case of a sudden spike of activity.
(There will be different "global" game servers like A, B, C... but each of those global game servers should be, behind the scenes, composed of 1-X docker containers running the "real" game server so that the "global" game server is only a concept)

The problem you state is too generic and it's difficult to give a concrete response. However let me be reckless and give you some general-purpose scaling advices:
Remove counters from databases. Instead primary keys that are auto-incremented IDs, try to assign random UUIDs.
Change data that must be validated against a central point by data that is self contained. For example, for authentication, instead of having the User Credentials in a DB, use JSON Web Tokens that can be verified by any host.
Use techniques such as Consistent Hashing to balance the load without need of load balancers. Of course use hashing functions that distribute well, to avoid/minimize collisions.
The above advices are basically about changing the design to migrate from stateful to stateless in as much as aspects as you can. If you anyway need to provide stateful parts, try to guess which entities will have more chance to share stateful data and allocate them in the same (or nearly server). For example, if there are cities in your game, try to allocate in the same server the users that are in the same city, since they are more willing to interact between them (and share stateful data) than users that are in different cities.
Of course if the city is too big and it's very crowded, you will probably need to partition the city in more servers to avoid overloading the server.

Your question is too broad and a general scaling problem as others have mentioned. It'd have been helpful if you'd stated more clearly what your system requirements are.
If it has to be real-time, then you can choose Redis as your main DB but then you'd need slaves (for replication) and you would not be able to scale automatically as you go*, since Redis doesn't support that. I assume that's not a good option when you're working with games (Sudden spikes are probable)
*there seems to be some managed solutions, you need to check them out
If it can be near real-time, using Apache Kafka can prove to be useful.
There's also a highly scalable DB which has everything you need called CockroachDB (I'm a contributor, yay!) but you need to run tests to see if it meets your latency requirements.
Overall, going with a very powerful server is a bad choice, since there's a ceiling and it'd cost you more to scale vertically.

There's a great benefit in scaling horizontally such an application. I'll try to write down some ideas.
Option 1 (stateful):
When planning stateful applications you need to take care about synchronisation of the state (via PubSub, Network Broadcasting or something else) and be aware that every synchronisation will take time to occur (when not blocking each operation). If this is ok for you, lets go ahead.
Let's say you have 80k operations per second on your whole cluster. That means that every process need to synchronise 80k state changes per second. This will be your bottleneck. Handling 80k changes per second is quiet a big challenge for a Node.js application (because it's single threaded and therefore blocking).
At the end you'll need to provision precisely the maximum amount of changes you want to be able to sync and perform some tests with different programming languages. The overhead of synchronising needs to be added to the general work load of the application. It could be beneficial to use some multithreaded language like C, Java/Scala or Go.
Option 2 (stateful with routing):*
In some cases it's feasible to implement a different kind of scaling.
When for example your application can be broken down into areas of a map, you could start with one app replication which holds the full map and when it scales up, it shares the map in a proportional way.
You'll need to implement some routing between the application servers, for example to change the state in city A of world B => call server xyz. This could be done automatically but downscaling will be a challenge.
This solution requires more care and knowledge about the application and is not as fault tolerant as option 1 but it could scale endlessly.
Option 3 (stateless):
Move the state to some other application and solve the problem elsewhere (like Redis, Etcd, ...)

Related

How to distribute NodeJS requests to several servers and merge the results

I have a simple NodeJS web app that calls several apis asynchronously and merges the results to return one big result. Now let's say that I want to optimize this. How do I do this?
I am new to NoeJS and also the concept of scaling systems. I have been reading about load balancing, distributed systems, etc... I think this is the right way to go, but honestly I don't know.
I was thinking of doing something like this -
Set up a system that has several servers, and each has an instance of a NodeJS webapp that makes an api call given a path, and returns the result.
Have a master server that grabs the result from each of these servers, and merge the result and return it to the client.
Is this right way to go? What technologies do I use? Thank you for your help.
I am guessing you are trying to setup web-crawling or api-crawling, to grab data from 3rd party end point. If that is true, you would have a list of users / IDs or something like that that you pass to the web service you call and grab the data.
First of making a large number of requests very fast and in a stable way is tricky and depends on several factors to be stable and robust.
Is the 3rd party API rate limited.
Network connection on the client machine making the requests.
Error handling for both API and client errors like connection reset etc.
Sheer volume of data you are fetching back, like if you are trying to crawl data on millions of users from 3rd party API as fast as possible.
Your instinct is correct that you would have to scale this over several servers or at-least several parallel node processes on machine with lot of resources, however start small, test, and then scale would be my recommendation. Here are a few steps.
Use a good robust node http client like axios
If you are dealing with huge number of items (username, ids. emails etc) you will need stable way of iterating over them. Put them in a database like PostgreSQL or MySQL.
From here on figure out what's the fastest rate at which your API supports calling. And write stable function to iterate over your 'input' and call the API.
Then you have a couple of options. If data you are collecting is separate for each request you make. You can save it back in the database for each input. If you literally want to merge the data from multiple API calls, you can use a key-value storage like Redis. You can give an ID to each call and create a combination key for input+request_id format, then when all requests are done, you can merge them.
When you a small scale model in place you can now add a good job manager like Kue or Bull to the mix, and split the set of inputs in database from point (2) over several jobs that can be run in parallel.
Once you have a stable job-manager for that can repeat this node process for a range of inputs , now you are at a point where you can scale.
Deploy this same code on multiple servers that all talk to same Database and Redis. Install the Node process to run using a process manager like PM2.
Finally the way setup works is, each copy of same node program fetches a different set of inputs (usernames/IDs etc) form the source database, and writes the results back to the database or Redis depending on how you want to handle the output.
Optional post processing on redis to fetch the key value pairs and merge the responses grouped by input.
Some important things you have to be hyper aware of when coding this issues are:
Memory Management: Use design patterns/code/libraries that saves you most memory. Load absolutely minimum of what you need to in memory. Eg: iterating on an array of 1 millions usernames in memory is more expensive than keeping them in database and paging over them.
Error Handing: There will be lots of them. API errors, unforeseen exceptions, memory leaks, network drops etc. Having robust error handling and recovery mechanism will save the day.
Logging: Good quality logging will be critical to keep a check on how different parts of system are doing. Look at winston.
Throttling API calls: Remember making 10,000 API calls at the same minute will likely crash your machine or even most APIs.At the very least go very slow due to memory overloads. However adding a slight delay (like 10 milliseconds) between every 10 parallel calls will be HUGE boost in speed and make the calls much more stable. This strategy is called throttling or rate-limiting the API calls. Finding a sweet spot that works for your problem is important. Yes going slow can actually make you reach goal faster!
Your question was quite broad without specific code question, this is a general strategy and hopefully will give you a good starting point and links to reference materials so you can start building your solution.

Hazelcast : server node and client node best practices for beginner

new to hazelcast, want to understand functionalities of client and server functions in a cluster.
lets say I have 4 different servers(not referring to hazelcast server)/machines and I want to maximize RAM utilization :-
Do I start 4 servers instances, one on each server/machine ?
Do I start 4 clients instances, one on each server/machine ?
Is business logic written only in client instance ? if so, then do server instance not contain any logic apart from managing the lifecycle ?
I know this would vary as per requirement, but I want to get a general idea.
Adding on to Ernest's statements. You would usually expect data to be held in cache and processing to be on the client. However, with hazelcast, it doesn't have to be that way. Check some interesting features like ExecutorService and EntryProcessors in the documentation.
You may also want to look at the concept of Near cache, where you still hold the data on dedicated Hz instance (servers), while maintaining a near-cache in the client. Be wary of data sync challenges around this, though this works well in most cases (again very subjective).
Hope these pointers give some idea to start off with. All the Best !
there is no single answer to your question. There are many factors to be considered. For example one of your questions is where does the business logic reside. This depends heavily on how the hazelcast is used. Lets say Hazelcast is used purly for Caching purposes. The business logic then resides entirly on the client side.
Alternativly if we say that Hazelcast is full of rich Pojos, and domain driven design is used then we can say the logic lies entirly on the hazelcast instance itself. Usualy in real life the truth is somewhere on between
In terms of memory utilization again this depends very much on your setup budget and so on.... We can say that if you have one server with a lot of ram and you don't use any commersial addons from Hazelcast like off memory heap then running several hazelcasts on the same machine with limited amount of memory each would be more beneficial compared to running a single node with a lot of memory.
Also it should be noted the phenomenon where allocating more than 32Gigs of heap will drive you into te 64 bit universe.
Again this depends on many factors. If you have a Live interactive application you can not tolerate big GC pausas so you would incline to usage of more hazelcasts with small heaps. If you have non interactive application tolerant to big GC pauses then it is the other way around you can have big heap. So you see there is no simple answer to your question.

Options for getting a CPU intensive job off my web server?

I have been working on a Web App for visualizing live data. It is crucial that this data is kept up to date on the client side without such updates being invoked directly by the client (e.g. no button presses or refreshing the page). Currently, on page load, I grab the current data set from a database (DynamoDB) via Ajax, and subsequent updates are pushed to any listening clients every 5 minutes via a Websockets connection (using Socket.io).
I have overlooked the computational load of this update job. It has to mine some data, process it, update the database, and send the update out to all clients. As a result, the web server is left unresponsive for about 30 seconds with each update. Furthermore, my current architecture limits me from putting my server behind a load balancer, which is something I anticipate coming up in the future. For both these reasons, I really need to get this update job off my web server.
I am relatively inexperienced in web development, and I don't feel I am knowledgeable enough about these technologies to know the drawbacks of the solutions I have come up with. Currently, I am considering:
Break the update off into a separate process so it does not block the Node event loop. This would solve my issue in the short term, but if I ever want to load balance my application, I can't have the update running on multiple machines.
Drop Websockets entirely and just have the client query the database every 5 minutes, while a separate process (or separate server if I want load balancing) keeps the database up to date without interacting directly with the client. Will this kind of access pattern put too much load on my db?
Have a separate server run the update, and send the result via Websockets (or maybe some other protocol) to my load balanced application servers, which then push that update to all listening clients as usual. Is this even possible?
Perhaps there are other solutions. It seems like this would be a relatively common problem, so I was hoping I could find some guidance here. What are the potential issues with the solutions I have proposed, and are there other possible solutions that my suit my use case better?
It sounds like you want one process sitting somewhere which crunches the data and publishes it to a stream. Clients can then subscribe to the stream as and when they like. Redis handles streams nicely, you could process your data and push it into a redis stream. You could then create a small node service which subscribes to the redis stream and pushes the formatted data out over a websocket or via polling.
In this scenario you can then scale up either the publishing process (the one crunching the numbers) if your data load goes up, or scale up your subscribed process (which serves the data over a websocket to browsers) if you get an influx of clients watching the data.
You can also easily distribute the hosting of these services across other machines, and even write them in different languages if you decide the number crunching needs something like threading.
You're then left with the issue of clients (web browsers) consuming this data with a load balance in-between. This can be a hard problem if you use websockets and is bundled with pros and cons. But importantly you'll have separated your data crunching from your result publishing and that'll isolate out your issue to only the load balancing.
I have done pretty much the same to check ressources on some of our servers.
I have a C# service getting the information on each server that we manage, sending them to a queue (Amq).
From there, I have a stomp client fetching data from amq and emiting them to a websocket.
My main micro service is fetching the data to save them into a db.
My visualisation webapp is connected to the same ws and is fetching the data as they are sent to display them.
The Amq step isn't mandatory at all, it's just something I had to work with (historical).
I don't know what type of data your are working with, so I don't know if my solution can apply to you.
Don't hesitate if I'm not clear or you have any question.
This is a big question and I'm not going to try and give you a definitive answer.
For option 2
It really depends on how expensive your queries are. You can make DynamoDB fast if you pay for enough throughput. That said, on the face it, re-loading your whole dataset, when that sounds like its probably large, probably isn't good engineering.
For option 3
This option seems best to me if its achievable, although admittedly its hard to say with such a complex system - obviously you can't share your whole project.
Given your are already using AWS you might want to look into AWS Lambda. If you can move the update process into a stand alone job, you can host it on lambda and move the load off the web server. Lambda is essentially infinitely scalable and you only pay for the compute you use.
This really depends on you being able to split the update task off into a separate service. Its likely you would need a fair bit of refactoring to isolate it as a service. If you can break little bits off at a time, and make the move gradually, even better.
If you consider trying this, and you've not used Lambda before, I would definitely start small with some hello world examples. Then try a very simple service in your application, and build up to taking on the update service.
You might also consider looking in AWS Simple Message Queue Service to handle the comms between clients and server.
Database tuning
If a lot of your update time is spent waiting for database actions to complete, rather than server processing, you can consider tuning that side of things up. Things to consider are:
Buying more throughput
Using batch operations (as these move load to DynamoDB from your server)
Tuning keys, indexes and database access

Azure Websites and stateful webApp

I have a naïve version of a PokerApp running as an Azure Website.
The server stores in its memory the state of the tables, (whose turn it is, blinds value, cards…) etc.
The problem here is that I don't know how much I can rely on the WebServer's memory to be "permanent". A simple restart of the server would cause that memory to be lost and therefore all the games in progress before the restart would get lost / cause trouble.
I've read about using TableStorage to keep session data and share it between instances, but in my case it's not just string of text that I want to share but let's say for example, a Lobby objcet which contains all info associated with the games.
This is very roughly the structure of the object I have in memory
After some of your comments, you can see the object that needs to be stored is quite big and is being almost constantly. I don't know how well serializing and deserializing is going to work for me here...
Should I consider an azure VM which I'm hoping is going to have persistent memory instead of a Website?
Or is there a better approach to achieve something like this?
Thanks all for the answers and comments, you've made it clear that one can't rely on local memory when working on the cloud.
I'm going to do some refactoring and optimize the "state" object and then use a caching service.
Two question come to my mind though, and once you throw some light on these ones I promise I will shut up and accept #astaykov's great answer.
CONCURRENCY AT INSTANCE LEVEL - I have classic thread locks in my app to avoid concurrency problems, so I'm hoping there is something equivalent for those caching services you guys propose?
Also, I have a few timeouts per table (increase blinds, number of seconds the players have to act…). Let's say a user has just folded a hand, he's finished interacting with the state object so I update the cache. While that state object (to which the timers belong) is cached, my timers will stop ticking…
I know I'm not explaining myself very well here but I hope you guys see my point.
I'd suggest using the Azure Redis Cache.
Here is a nice sample how to build MVC App with Redis Cache in 15 minutes.
You can, of course use the Azure Managed Cache. Or end up with Azure Tables. And Azure Tables can hold much more then just a string. But I believe the caching solutions would have lower latency in communication.
In either way, your objects have to be serializable. And yes - the objects will get serialized/deserialized by every access. You can do it manually, or let the framework do it for you. From what I've read, NewtonSoft.JSON is quite good and optimized JSON serializerdeserializer.
UPDATE
When you ask for a VM running in the cloud - a VM will be restarted sooner or later! Application Pool will recycle, a planned maintenance will occur, an unplanned maintenance will occur, a hard disk will fail, a memory module will fail, unforeseen disaster will happen.
Only one thing is for sure - if you want your data to survive server crashes, change the way you think and design software, and take data out of (local) the memory. Or just live the fact that application may lose state sometime.
Second update - for the clocks
Well, you have to play with your imagination and experience. I would question that your clocks work anyway in the context of the ASP.NET app (unless all of them being static properties of a static type, which would be a little hell). My approach would be heavily extend my app to the client as well (JavaScript). There are a lot of great frameworks out there - SignalR, AngularJS, KnockoutJS, none of them to be underestimated! By extending your object model to the client, you can maintain players objects lifetime on the client (keeping the clock ticking) and sending updates from the client to the server for all those events. If you take a look at SignalR, you can keep real time communication between multiple clients (say players) and the server. And the server side of SignalR scales out nicely with Azure Service Bus and even Redis.

High Scalability in Domain-Driven Design

I'm using DDD for a service-oriented application intended to transmit a high volume of messages between a high volume of web clients (i.e., browsers).
Because in the context of required functionality, the need for transmission outweighs the need for storage, I love the idea of relying on RAM primarily and minimizing use of the database.
However I'm unclear on how to architect this from a scalability point of view. A web farm creates high availability of service endpoints and domain logic processing. But no matter how many servers I have, it seems they must all share a common repository so that their data is consistent.
How do I build this repository so that it's as scalable as possible? How can it be splashed across an array of physical machines in a manner such that all machines are consistent and each couldn't care less if another goes down?
Also since touching the database will be required occasionally (e.g., when a client goes missing and messages intended for it must be stored until it returns), how should I organize my memory-based code and data access layer? Are they both considered "the repository"?
There are several ways to solve this issue. No single answer can really cover it all...
One method to ensure your scalability is to simply scale the hardware. Write your web services to be stateless so that you can run a web farm (all running the same identical services, pointing to the same DB) and turn your DB into a cluster. Clustered databases run over multiple servers and work on the same storage. However, this scenario can get complicated and expensive quite quickly.
Some interesting links:
http://scale-out-blog.blogspot.com/2009/09/future-of-database-clustering.html
http://en.wikipedia.org/wiki/Server_farm
Another method is to look at architecture. CQRS is a common architectural model that ensures scalability. For instance, this architecture model -- its name stands for Command/Query Responsibility Segregation -- builds different databases for reading and writing. This seems contradictory, but if you study it, it becomes natural and you wonder why you've never thought of it before. Simply put, most apps do a lot more reading than writing, and writing tends to be a lot more complicated than reading (requiring business rule validation etc.), so why not separate the two? You can use your expensive transactional database for writing and then your cheap, maybe Non-SQL based or open source, database over multiple reading servers. Your read model is then optimized for the screens of your application(s), whereas the write model is optimized solely for writing and is in fact a DDD-based set of repositories.
There's just not enough room here to cover this option in detail, but CQRS is a good way of achieving scalability and even ease of development, once you have a CQRS framework in place. There are many other advantages to CQRS, such as ease of auditing (if you combine it with the "event sourcing" technique, which is common in CQRS-based environments).
Some interesting links:
http://cqrsinfo.com
http://abdullin.com/cqrs
http://blog.fossmo.net/post/Command-and-Query-Responsibility-Segregation-(CQRS).aspx
Are you ready for some reading? There are a lot of options, but I believe you should start by learning about the advantages of modern distributed NoSQL dbs, and enjoy learning from the experience learned in facebook, linkedin and other friends. Start here:
http://highscalability.com/
http://nosql-database.org/

Resources