My website is written in Node.js, has no database or external dependencies, but does have lot of large media files (images and some video) totalling some 2gb. The structure of the website is drawn from a couple of simple JSON files.
My problem is drastic and sudden scaling. Traffic to my site is usually easily handled by any small VPS instance, but occasionally traffic can get to hundreds of times it normal level for short periods. My problem is how to scale quickly, without downtime, and automatically. I know there are issues with autoscaling, but perhaps lacking a database will negate some of that.
What sort of scaling issues and options should I be looking at?
(For context, I am currently using a Digital Ocean VPS, but I can't find a clean way to scale it with no downtime. I am not wedded to my provider.)
Scalability is important, but scaling when you need to is also important. We all do not have the scaling needs of Facebook or Twitter : ) This might just be a case of resource management.
Test the problem
Without a database and using NodeJS, some of the strengths of node are its number of concurrent connections. For simple io load, it would seem you have picked a good choice of framework. And, since your problem set is a particular resource being bombarded, run some load testing on your server. Popular and free tools include:
Apache Bench
httperf
OpenLoad
And there are pay service like NeoLoad, LoadImpact (which is free at small levels), forecastweb, E-Load, etc..
With those results, Determine the Cause
Is it the size of the file being served? Is it the number of concurrent requests? What resources are being used, or maxed out, during a slowdown (ram, ports, file system, some other IO, CPU, bandwidth, etc...)?
Have a look at this question, which defines a few concepts for server load. To implement a solution, you will need to determine the cause of the slowdown. Is it: 1)Some queues fill up? 2)Problem with TCP Connections and Ports? 3) Too slow allocation of resources? That will help shape your solution.
Plan for scaling.
The type of scaling needed for your project may only be the portion needed for another. If you know the root cause in this case, it will increase your options.
Is the problem bandwidth? Perhaps using your web server as a router to multiple cloud instances of file serving would effectively increase the bandwidth your users see. Even just storing your files on a larger cloud that can guarantee the bandwidth you may need.
Is the problem CPU, RAM, etc? You may need multiple instances of the same web app (or an increased allotment for your VFS). This is the "Elastic" portion of Amazon's Elastic Cloud Computing (EC2), and other models like it. Create a "golden image" and duplicate when you see traffic start spiking, using built-in monitoring tools, turning it off when the rush is done. Can be programatic or simply manual.
Is the problem concurrent requests? The bottleneck should not be NodeJS, up to 1000's of concurrent requests anyway. Perhaps just check your implementation to ensure there is not a slowdown of the single node thread. Maybe node clustering or some worker threads would alleviate the bottleneck enough for your purposes.
Last Note: For serving static files I've heard nginx or even Apache Tomcat is a little more well-suited than NodeJS. Depending on your web app's complexity, you might be able to switch or benchmark fairly easily.
In case anyone is reading this rather specific question years later, I have gained some perspective on it. As Clay says, the ultimate answer is to spin up more servers, either manually or programatically based on load.
However, in my case that would be massive overkill - I'm not running Twitter. The problem was a relatively simple mistake in architecture. My app was reading the JSON data files from disk with every page request, and the disk I/O was getting saturated. I changed to loading the data files into memory on startup, and reloading them when they change using fs.watch().
My modest VPS can now easily handle the sorts of traffic that would previously crash it. I've never seen traffic that would make me want to up-size it.
Related
In a Next.js app, what would be the most efficient (fastest) way to retrieve the users country?
Among other things, I would use it to determine which scripts are loaded using next/script.
I looked in to node-geoip and fast-geoip, but even though fast-geoip has a very thorough explanation below, I do not understand the mechanisms behind Next.js/Node.js to evaluate the methods properly.
Concretely, what geoip-lite does is that, on startup, it reads the whole database from disk, parses it and puts it all on memory, thus this results in the startup time being increased by about ~233 ms along with an increase of memory being used by the process of around ~110 MB, in exchange for any new queries being resolved with low sub-millisecond latencies (~0.02 ms).
This works if you have a long-running process that will need to geolocate a lot of IPs and don't care about the increases in memory usage nor startup time, but if, for example, your use-case requires only geolocating a single IP, these trade-offs don't make much sense as only a small part of the satabase is needed to answer that query, not all of it.
This library tries to provide a solution for these use-cases by separating the database into chunks and building an indexing tree around them, so that IP lookups only have to read the parts of the database that are needed for the query at hand. This results in the first query taking around 9ms and subsequent ones that hit the disk cache taking 0.7 ms, while memory consumption is kept at around 0.7MB.
Wrapping it up, geoip-lite has huge overhead costs but sub-millisecond queries whereas this library doesn't have any overhead costs but its queries are slower (0.7-9 ms).
As geoip would be called for every visitor, I assume it would have to read the whole database on each initialization and thereby making fast-geoip the best choice?
or is there some built in mechanism, that makes sure it is accessed from memory across the subsequent requests, when frequently loaded and hence making node-geoip the best choice?
or am I focused on solving my problem the wrong way, and should rather see if there is some way I can get the location via the users browser?
Would appreciate any feedback, even if there is a completely different path worth exploring:-)
I read the documentation for fast-geoip. It's designed for "serverless" cloud services such as AWS Lambda, GCP Cloud Functions, CF Workers where RAM is limited and expensive.
Note the package author's emphasis on low steady-state RAM use in the graphs below.
In summary, assuming a cloud VM/bare-metal deployment and the need to call the IP to location method on every page request, there is probably no compelling reason to use the above package.
PS: Check if the above packages require you to rotate a DB file on disk every few weeks (or rebuild+redeploy your Node app) to keep data up to date. There are commercial REST APIs such as the one in my bio (I am the developer) that may mitigate this hassle, YMMV.
recently started to develop with node and ran into a problem. I have a web service which is a raw bank. basically a collection of raw files (photography stuff). Users just upload them and download. nothing fancy. But recently i came up with an idea to add sorting feature which depends on camera settings: shutter speed, geolocation, fstop, colors and etc. basically upon uploading a raw file I need to process it and this is very heavy files, roughly 60-150 MB each and usually user uploads 3-4 files. what would be the best solution to process heavy files without actually harming server performance.
If you are doing the raw calculations yourself you could look into GPU accelerating them. The best library currently out there for that is https://gpu.rocks/. If you haven't already also make your server work asynchronously and even try making the it with node's cluster feature (the closest you can get to multi-threading in js).
There's a number of things to consider here:
Do you have a good work-queue solution to prioritize jobs and fan them out across multiple worker processes?
Do you make use of things like WebWorkers to make each process much more productive on multi-core systems?
Do you use compiled libraries to help process the images faster? As KolCrooks says, GPU-accelerated libraries are a huge asset as they can cut processing time down from minutes to fractions of a second. This is only relevant if your server has adequate GPU resources, "built-in" GPUs rarely suffice.
How are you storing and exchanging these images? What network topology can you use? 10Gbit vs. 1GBit could make a huge difference here.
I am currently working on a web-based MMORPG game and would like to setup an auto-scaling strategy based on Docker and DigitalOcean droplets.
However, I am wondering how I could manage to do so:
My game server would have to be splittable across different Docker containers BUT every game server instance should act as if it was only one gigantic game server. That means that every modification happening in one (character moving) should also be mirrored in every other game server.
I am trying to get this to work (at least conceptually) but can't find a way to synchronize all my instances properly. Should I use a master only broadcasting events or is there an alternative?
I was wondering the same thing about my MySQL database: since every game server would have to read/write from/to the db, how would I make it scale properly as the game gets bigger and bigger? The best solution I could think of was to keep the database on a single server which would be very powerful.
I understand that this could be easy if all game servers didn't have to "share" their state but this is primarily thought so that I can scale quickly in case of a sudden spike of activity.
(There will be different "global" game servers like A, B, C... but each of those global game servers should be, behind the scenes, composed of 1-X docker containers running the "real" game server so that the "global" game server is only a concept)
The problem you state is too generic and it's difficult to give a concrete response. However let me be reckless and give you some general-purpose scaling advices:
Remove counters from databases. Instead primary keys that are auto-incremented IDs, try to assign random UUIDs.
Change data that must be validated against a central point by data that is self contained. For example, for authentication, instead of having the User Credentials in a DB, use JSON Web Tokens that can be verified by any host.
Use techniques such as Consistent Hashing to balance the load without need of load balancers. Of course use hashing functions that distribute well, to avoid/minimize collisions.
The above advices are basically about changing the design to migrate from stateful to stateless in as much as aspects as you can. If you anyway need to provide stateful parts, try to guess which entities will have more chance to share stateful data and allocate them in the same (or nearly server). For example, if there are cities in your game, try to allocate in the same server the users that are in the same city, since they are more willing to interact between them (and share stateful data) than users that are in different cities.
Of course if the city is too big and it's very crowded, you will probably need to partition the city in more servers to avoid overloading the server.
Your question is too broad and a general scaling problem as others have mentioned. It'd have been helpful if you'd stated more clearly what your system requirements are.
If it has to be real-time, then you can choose Redis as your main DB but then you'd need slaves (for replication) and you would not be able to scale automatically as you go*, since Redis doesn't support that. I assume that's not a good option when you're working with games (Sudden spikes are probable)
*there seems to be some managed solutions, you need to check them out
If it can be near real-time, using Apache Kafka can prove to be useful.
There's also a highly scalable DB which has everything you need called CockroachDB (I'm a contributor, yay!) but you need to run tests to see if it meets your latency requirements.
Overall, going with a very powerful server is a bad choice, since there's a ceiling and it'd cost you more to scale vertically.
There's a great benefit in scaling horizontally such an application. I'll try to write down some ideas.
Option 1 (stateful):
When planning stateful applications you need to take care about synchronisation of the state (via PubSub, Network Broadcasting or something else) and be aware that every synchronisation will take time to occur (when not blocking each operation). If this is ok for you, lets go ahead.
Let's say you have 80k operations per second on your whole cluster. That means that every process need to synchronise 80k state changes per second. This will be your bottleneck. Handling 80k changes per second is quiet a big challenge for a Node.js application (because it's single threaded and therefore blocking).
At the end you'll need to provision precisely the maximum amount of changes you want to be able to sync and perform some tests with different programming languages. The overhead of synchronising needs to be added to the general work load of the application. It could be beneficial to use some multithreaded language like C, Java/Scala or Go.
Option 2 (stateful with routing):*
In some cases it's feasible to implement a different kind of scaling.
When for example your application can be broken down into areas of a map, you could start with one app replication which holds the full map and when it scales up, it shares the map in a proportional way.
You'll need to implement some routing between the application servers, for example to change the state in city A of world B => call server xyz. This could be done automatically but downscaling will be a challenge.
This solution requires more care and knowledge about the application and is not as fault tolerant as option 1 but it could scale endlessly.
Option 3 (stateless):
Move the state to some other application and solve the problem elsewhere (like Redis, Etcd, ...)
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 3 years ago.
Improve this question
I'd need to build a simple analytics back-end for capturing user behaviour. This will be captured via a Javascript snippet on a webpage just like Google Analytics or Mixpanel data.
The system needs to capture close-to-realtime browser data (scrolling position of page, mouse position etc.) It will record the state of the users' page every 5 seconds. There are only three attributes on each measurement but they are have to be taken frequently.
The data doesn't necessarily need to be sent every 5 seconds, it could be bussed up less frequently however it's imperative that I get all of the data while the user is on the page. i.e. I can't bus it once per minute and lose the last 59 seconds of data for someone who leaves after 119 seconds.
If possible I'd like to build a system that will scale for the foreseeable future which means it working for 10,000 sites, each with 100 concurrent visitors, i.e. 100,000 concurrent users each sending one event every 5 seconds.
I'm not worried about querying the data, that can be done using a separate system. I'm most interested in how to handle the capture of the data itself.
Requirements
Based on the budgeting above, the system needs to handle 20,000 events per second coming from a pool of 100,000 users.
I'd like to host this service on Heroku however while I've done a lot of work with Rails, I'm completely new to high throughput systems (other than knowing you don't process them using Rails).
Questions
Is there a commercial system that would be good for doing this (like Pusher but for data capture as well as distribution)?
Should I be looking to do this using HTTP requests or websockets?
Is node.js the right choice for this or just trendy?
If I were to chose a socket-based solution, how many sockets can a dyno on Heroku handle for each webserver
What are the pertinent considerations for choosing between Mongo / Reddis etc. for storage
Is this the type of problem which actually requires two solutions - the first to get you to reasonable scale quickly and inexpensively and the second to take you past that scale on lower incremental cost but with more development effort required upfront?
My high level comment for you is to build your system following the 12 factor design, and then worry about scaling as the customers arrive. I'm thrilled with Node.js and the npm ecosystem, but I also think you could build a perfectly acceptable platform with Rails. If it took 3 dynos to support 100 K concurrent users with Node, and double that with Rails, you still might be better off with Rails, if your comfort with Ruby got you to market 3 months faster. Anyway, assuming you go with Node, here are my answers:
Here are some alternatives to Pusher that might work for you and a discussion of Pusher vs. Pubnub. Also see Ably.
Use socket.io. It's largely the standard, because it uses the best transport available and falls back from WebSockets to HTTP methods.
Node is a fantastic choice and is also trendy (see the module growth rate). I suspect you could make your system work fine in Node, Rails or several other frameworks.
A Heroku dyno should be able to support tens of thousands of concurrent connections, depending on how efficient you are with RAM. A server with 16 GB of RAM was able to support a million concurrent connections. Assuming you're RAM-limited, a Heroku dyno with 512 MB of RAM should be able to support ~30 K connections.
You likely want to pick two different systems, one for storage and processing of your data, and one for caching. Here's a great post about picking your core data platform from the creator of Instagram. For core data, I recommend Postgres (on Heroku) using the Sequelize ORM. But, Mongo with SOLR for search would probably work fine too. Note that Postgres 9.2 can be used as a NoSQL datastore if that's the way you want to go. For a caching system I highly recommend Redis.
No, I would try to avoid throw away engineering. Instead, build something that works, and expect that everytime you reach an order of magnitude more traffic, some piece of the system will break and need to be replaced. But, if you follow the 12 Factor principles, you should be in good shape to scale horizontally while you're investing in the replacement.
Good luck.
There are many services for sockets, but Pusher and Pubnub seem to be the market leaders in this space. What ever you do, don't host your own like socket.io because heroku times out requests longer than 30 seconds, including websockets. So a hosted socket would definitely be out of the question unless you plan on closing and re-opening the socket every few seconds.
If you were to use a socket service like Pusher, then you will need to implement a http endpoint for the service to send you the data anyway. So I would just cut the middle man out and go with a direct http request. Granted you need to collect constant user interactions, but that can all be recorded on the JavaScript client and sent back to the app periodically through CORS XHR or a tracking image.
node is a great choice, it's light, pretty easy to set up and the npm libraries available will have everything you need to get you started. Rails can be pretty swift too, especially if you cut out the things you don't need. There is a great railscast on this subject. The important thing is to keep it as simple as possible. Maybe split it into two applications; one for collecting data, the other for analysing/process it. This way you could collect the data in node cause it's fast and analyse/process it in rails cause it's easy.
As I mentioned in 1. sockets just aren't going to work in heroku and even if you used pusher you're still going to have to support the same number of http requests because when pusher receives the data it's going to send it straight on to you. As for how many dynos will you need, this will be something that will be easily tested but not something I can estimate. It will depend entirely on the efficiency of the code collecting the data. A simple Apache AB test with the load and concurrency you are expecting will give you a good indication of what you will need. Node comes with it's own concurrency but if you were to use rails to collect the data then use unicorn or puma as your server because they support concurrency. Also try different configurations when Apache AB testing; heroku now provide 2x dynos which are 1024mb instead of 512 which will allow you more concurrency
This stackoverflow thread suggests redis is faster and faster is what you're going to want for collecting the data. Though after collecting it, you'll probably want to process it and store it in more than a key, value store. Mongo is a good option for that but i would go with a graph database like neo4j because of the intricate connections analytics have.
If your entering new ground here, then you are not going to get it right first time, you will find yourself iterating over it to get the best performance and the most accurate data. Eventually you'll probably delete it and start again with a new architecture and the cycle will continue. Keeping the data collection and the analysis separate means you can focus on getting each bit right separately.
A few addional points I would like to mention is use a CDN for distribution of the JavaScript client, or better yet, provide the full JS to serve from the page. Either way, load fast and load asynchronously. It sounds like a fun project. Good luck!
EDIT In an alternate universe, where you do not have to use heroku, websockets would be an awesome solution.
I'm a developer of a MMO game and currently we're at my company facing some scalability issues which, I think, can be resolved with proper clustering of the game world.
I don't really want to reinvent the wheel that's why I think Linux Virtual Server could be a good choice especially with some Level 7 load balancing technique.
I'm currently looking at ktcpvs as a load balancing solution and wonder if it's a proper choice.
The main idea is to have a number of zones("locations" in terms of my game) running on dedicated servers. When a player decides to go to some specific location the load balancer decides which zone server will be actually serving the player(that's actually why I need a Level 7 load balancer)
What do you folks think about all said above?
Update: I posted the same question to LVS users mailing list http://marc.info/?l=linux-virtual-server&m=124976265209769&w=2
Update: I also started the similar topic on the gamedev.net forum http://www.gamedev.net/community/forums/topic.asp?topic_id=544386
In order to address your question we need to understand whether you need volume or response, but it is difficult to get both at the same time.
Layer 7 load balancing - is data based application level balancing, so the data content of the network packet needs to be routed to an end-point. You can achieve volume (more users) by implementing routing at the application level, service level or kernel level.
Scalability - I assume you are running out of memory, CPU resources and network bandwidth.
Application level - your application logic receives an application packet and routes accordingly.
Service level - your system framework (front end service of some kind) receives the packet and through a module - performs the routing (think of custom apache module, even network driver modules - like writing a network filter)
Kernel level - Performs routing at network packet level.
The closer you move to the metal, the better your response will be. I suggest using dedicated linux server up-front to perform the routing - go native, not virtual. Use multiple or teamed network adapters for the WAN and a dedicated adapter for each end-point (one+ wan, one each for each connected app server)
If response time is important then you need a kernel/supervisor state solution, it will save you a few context switches but be aware that you need to limit hops at all costs and could better be served by fewer, larger machines and your scalability will always be limited. There is a risk in using KTCPVS, it is quite old and not actively updated. If you judge that it works for you great, otherwise consider writing something akin to a network filter as long as it runs in system state.
If volume is important but response time is secondary, implement a custom built high-speed socket switch built in C++ running in problem/user state. It is the easiest to maintain and will offer the best scalability.
You will need to build some prototypes to figure out what suits your needs best.
Final thoughts -
Before doing any of the above first ensure that you have optimized your game design. You may know most of this, I list it here for the benefit of all.
(a) messages should fit comfortably within one network packet, less than 1500 bytes for most home routers
(b) Try to fit the logic of the routing in your game client instead of your servers. A simple download of a small table with zones and IP addresses to a client will allow you to forego all of the above.
(c) Try to limit zone visibility by to the clients, they should know about their zones and adjacent zones only (if you implement the point b above)
Hope this helps, sorry I cannot be more specific regarding KTCPVS.
You haven't specified where the bottleneck is. Network Traffic? Disk IO? CPU Cycles?
Assuming you mean a layer 7 load balancer and don't have enough CPU power, I think LVS ist not the optimal choice. I have done Web Server load balancing with LVS, which works straightforward and isn't exactly complicated.
But I think load balancing an MMORP this way needs considerable amounts of additional code in LVS, it might be easier to do the load balancing with a multithreaded application distributed over some multicore server. But this isn't fully scalable, this only gets you to 16 cores without prohibitve cost increase.
The biggest issue in something like this is what happens when players are near a boundary. Obviously they need to be able to see and interact with each other, but they're on separate servers. So you need some pretty fancy inter-server communication, sometimes just duplicating messages to both servers. It can get even more complicated when someone is near a "corner", and then you have to deal with 4 servers!
The book Massively Multiplayer Game Development has a chapter on "The Pitfalls of Shared Server Boundaries" which covers this issue in detail.
I haven't heard of Linux Virtual Server before now, so I don't understand how it fits. I think your actual server application needs to support this game-specific load balancing, rather than trying to run a cluster and assuming that it will automatically know how to split up your application (which it won't). If I were you, I would write the server program to handle its own piece of land, and it should connect to the pieces of land around it, and then design a server-to-server protocol for the passing of these messages ("here comes a player, I'm going to start telling you about him!" "make sure to tell me about messages near our boundary", "okay the player is out of my territory and into yours, here's his detailed data", etc). I think it's a bit more complicated than just running a different flavor of Linux and assuming you'll get automatic load balancing.
Why are you moving the distribution logic to the loadbalancer? It's a component that's not free and can break. It seems your clients are quite aware of which zone they're in. It seems they could very well connect to zone<n>.example.com. You'd then handle loadbalancing at DNS level.