Server architecture for a scalable web application - Node.js

We're planning to deploy a web application with Amazon OpsWorks, and I just wanted to check with you whether our architecture might have any design flaws.
We have 4 components:
A load balancer (preferably Amazon's)
Express based on Node.js
MongoDB
ElasticSearch
Here's a communication diagram of our components:
At the front is a load balancer which distributes http requests to multiple web servers.
The web server is stateless and can therefore be cloned whenever the load requires it. All web server instances are equal. Session information is saved in MongoDB.
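For reference, the session handling on the web servers looks roughly like this (a minimal sketch using express-session with connect-mongo; the secret and connection URL are placeholders):

var express = require('express');
var session = require('express-session');
var MongoStore = require('connect-mongo')(session);

var app = express();
app.use(session({
  secret: 'keyboard cat', // placeholder; use a real secret
  resave: false,
  saveUninitialized: false,
  // all web server instances share the same session store in MongoDB
  store: new MongoStore({ url: 'mongodb://mongo-master/sessions' })
}));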
In the "backend" we're planing to use the build-in cluster functionalities from MongoDB and ElasticSearch. Therefore each web server instance only connects to a single MongoDB and ElasticSearch master instance. MongoDB and ElasticSearch are then scaling accordingly. Furthermore the the ElasticSearch master speaks to the MongoDB master to retrieve data for building the index.
As we see it, the most challenging part of setting up such a system is configuring OpsWorks with the MongoDB and ElasticSearch clusters.
Many thanks in advance!

"...if our architecture might have any design flaws."
Well, keep in mind that we can't tell much from a generic diagram. But here are some notes:
1) MongoDB isn't as easy to scale as other databases such as DynamoDB, Riak or Cassandra. For example, if you ever exceed the capacity of a single master (no matter how many slaves you have, all writes go to the single master), you'll have to shard. But switching to sharding is very disruptive and very tedious to set up.
If you don't expect to exceed the write capacity of one node, then you'll be fine on MongoDB.
2) What will you do for async tasks such as sending emails, creating long reports, etc?
It's possible to do these things in the request loop, and that's probably a fine way to get started. But as you have more boxes, the chances of failure go up. When a box dies, all the async tasks go away and nobody will know what they were. You also can have problems where one box gets heavily loaded with async tasks (using too much CPU or memory), and the problem will get worse and worse as it gets more tasks and completes them more slowly.
Also, a front-end like ELB will have a 60-second limit, which can cause problems if some of your requests could take longer. (Spin them off into async jobs with polling or something.)
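As a sketch of the "spin them off into async jobs with polling" idea (the in-memory job store and buildReport function are placeholders; with multiple boxes you'd keep job state in a shared store such as MongoDB or Redis):

var express = require('express');
var crypto = require('crypto');
var app = express();
var jobs = {}; // placeholder: use a shared store in production

app.post('/reports', function (req, res) {
  var id = crypto.randomBytes(8).toString('hex');
  jobs[id] = { status: 'pending' };
  buildReport(function (err, result) {
    jobs[id] = err ? { status: 'failed' } : { status: 'done', result: result };
  });
  // respond immediately, well under the load balancer's 60-second limit
  res.status(202).json({ id: id });
});

// clients poll this endpoint until the job is done
app.get('/reports/:id', function (req, res) {
  res.json(jobs[req.params.id] || { status: 'unknown' });
});

function buildReport(cb) {
  setTimeout(function () { cb(null, 'report-data'); }, 5 * 60 * 1000); // simulated long task
}

app.listen(3000);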
3) ELB doesn't support WebSockets. Consider that if you think you might want WebSockets down the road.

There's no such thing as a master in Elasticsearch. You have master copies of shards and replicas of shards, but they are basically moved around your cluster by Elasticsearch. A node might hold the master copy of one shard and a replica of another. So, you could simply put a load balancer in front of it.
However, you can specialize nodes to be data nodes or routing nodes as explained here: http://www.elasticsearch.org/guide/reference/modules/node/
The routing nodes effectively become load balancers. You could have a few of those (for redundancy) and distribute load between them. Alternatively, you could run a dedicated routing node on each web server. Routing nodes are pretty light, and you save a bit of bandwidth/latency since your web server talks to localhost and from there on it is all Elasticsearch-internal cluster traffic.
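With a local routing node, the application code just talks to localhost (a minimal sketch using the elasticsearch Node.js client; the index name and query are made up):

var elasticsearch = require('elasticsearch');
var client = new elasticsearch.Client({
  host: 'localhost:9200' // the local routing node forwards to the data nodes
});

client.search({ index: 'products', q: 'title:shoes' }, function (err, body) {
  if (err) return console.error(err);
  console.log('hits:', body.hits.total);
});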

I'd recommend replacing MongoDB with Amazon DynamoDB (it has a Node.js SDK).

Related

Which MongoDB scaling strategy (Sharding, Replication) is suitable for concurrent connections?

Consider the following scenario:
I have multiple devclouds (remote workplace for developers), they are all virtual machines running on the same bare-metal server.
In the past, they used their own MongoDB containers running on Docker, so the number of MongoDB containers can add up to over 50 instances across devclouds.
The problem is that while 50 instances are running at the same time, only about 5 people actually perform read/write operations against their own instances, so the other 45 running instances waste the server's resources.
Should I use only one MongoDB cluster (combining a set of MongoDB instances) for everyone, so that they all connect to a single endpoint (via the internal network) to avoid wasting resources?
I am considering the sharding strategy, but there is a chance that a node gets taken down (one VM shut down). Is that OK for availability (redundancy)?
I am pretty new to sharding and replication and am looking forward to your solutions. Thank you!
If each developer expects to have full control over their database deployment, you can't combine the deployments. Otherwise one developer can delete all data in the deployment, etc.
If each developer expects to have access to one database, you can deploy a single replica set serving all developers and assign one database per developer (via authentication).
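For example, from the mongo shell (a sketch; the database, user, and role names are illustrative):

// One database per developer, locked down to its owner.
var dev = db.getSiblingDB('dev_alice');
dev.createUser({
  user: 'alice',
  pwd: 'change-me', // placeholder
  roles: [{ role: 'dbOwner', db: 'dev_alice' }]
});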
Sharding in the MongoDB sense (a sharded cluster) is not really going to help in this scenario, since an application generally uses all of the shards. You can of course "shard manually" by setting up multiple replica sets.

Is implementing an Elasticsearch service on the same server as the Node server, with auto scaling, a good idea?

Trying to deploy a project on a t3.large server with auto scaling.
I have my Elasticsearch service deployed on the same system as the Node and React projects. (Not using AWS Elasticsearch.)
Will it face issues in the future, so that I need to segregate the Elasticsearch service to some other server?
It's always nice to have a separate dedicated server for running Elasticsearch, but as you are using AWS, here are some things you can do to minimize the issues:
Elasticsearch is a stateful application, in contrast to your Node and React apps (unless you are storing state there as well, which is not a good idea). Due to the stateless nature of those applications, autoscaling is very useful: you can scale the instances up or down on demand based on CPU, memory, or other metrics.
But in the case of Elasticsearch or other stateful applications it becomes tricky: when you scale instances up or down, shards get relocated if a node is not reachable within a threshold, which can lead to an unbalanced Elasticsearch cluster.
Now in order to minimize these issues:
Make sure you are storing Elasticsearch indices on a network-attached disk, so that there is no data loss when autoscaling brings up a new instance; the new instance should reattach the earlier network-attached EBS volume (where your data is stored).
Make sure you don't create a new Elasticsearch process when instances scale up or down according to your autoscaling policy; the set of Elasticsearch processes should be fixed and only scaled up/down with some manual intervention.
If you have to scale up the Elasticsearch cluster, then make sure you disable shard allocation first to avoid the issues mentioned earlier (see the sketch after this list).
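For example, using the elasticsearch Node.js client (a sketch; the same cluster settings API can be called via curl):

var elasticsearch = require('elasticsearch');
var client = new elasticsearch.Client({ host: 'localhost:9200' });

// Stop shard relocation before scaling, then re-enable it afterwards.
client.cluster.putSettings({
  body: { transient: { 'cluster.routing.allocation.enable': 'none' } }
}, function (err) {
  if (err) return console.error(err);
  // ... perform the scaling operation, then:
  client.cluster.putSettings({
    body: { transient: { 'cluster.routing.allocation.enable': 'all' } }
  });
});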
These are some known issues you might face, and there could be even more depending on your configuration. While writing this answer I felt how much easier it is to just have a dedicated instance for Elasticsearch and avoid these weird issues.
I would add the following to the other answers:
Elasticsearch performs best if it has enough RAM to keep the indexes in RAM in their entirety. If Elasticsearch is competing with the Node application for RAM, it will affect its performance.
From a maintenance/performance perspective you should consider having at least a 3-node cluster, even if that means smaller machines. If AWS is upgrading infrastructure and you have one machine, then when that 0.05% unavailability hits, your search is down. If you need to do maintenance on a node or do upgrades, having multiple machines will help with availability.
Depending on your use of Elasticsearch and how often you update/delete items in the indexes, and how fast your indexes will grow, adding more machines/nodes to the cluster will help with growth.
There are probably many more things to consider, but that totally depends on your application, budget, SLAs etc.

Deploying Node + MongoDB API on AWS or GCP

I am working on a Node + MongoDB API and the API is deployed on a VM on Google Cloud Platform. Currently, the data is stored in a MongoDB instance running on the VM.
Is running a local MongoDB instance for production a good practice? How do the various cloud services providing MongoDB compare with each other? What are some good practices to ensure the API is scalable? Also, can deploying the API to Kubernetes as a container offer better results compared to VMs?
Having both the server and the database running in the same VM instance is always a bad idea for a service that you want to scale. Let's think about what happens when your instance can no longer handle the amount of traffic being sent to it.
At first you might just scale vertically, adding more CPU and RAM to the currently running instance, and yes, that would be fine for some period of time. But it is really more of a temporary life-extending plan, since at some point your instance will not be able to scale vertically any further. And here comes horizontal scaling...
A new instance has to be created to handle the increasing traffic, and how would you design the new instance for that?
The best practice is to have an instance that looks exactly the same as the first one. So the server needs to be designed to be stateless. In this case, your service will be able to scale to any number of machines.
That's why you should not have the API front end and the database running in the same instance: it could not be scaled.
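One small but important detail: each instance should pick up its configuration (such as the database location) from the environment, so that a freshly cloned instance needs no special setup. A sketch (mongoose and the variable name are assumptions; any MongoDB driver works the same way):

var mongoose = require('mongoose');

// The database lives on its own VM; its address comes from the environment,
// so every autoscaled instance is configured identically.
mongoose.connect(process.env.MONGODB_URL || 'mongodb://db-host/app');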
So what about the database? It is quite straightforward: simply place it in another VM instance! If the database is still able to handle the requests, you can just share the database server among the server instances. But if there is a bottleneck on the database side, you can scale your MongoDB separately with a cluster or the like.
If you don't have time, you can consider using a ready-to-use service, like a Heroku add-on (for example, mongolab), to manage the MongoDB instance for you.
Cheers.

How to make a distributed node.js application?

Creating a node.js application is simple enough.
var app = require('express')();
app.get('/', function (req, res) {
  res.send("Hello world!");
});
app.listen(3000);
But suppose people became obsessed with your Hello World! application and exhausted your resources. How could this example be scaled up in practice? I don't understand it, because yes, you could run several node.js instances on different computers - but when someone accesses http://your_site.com/, it goes directly to one specific machine, one specific port, one specific node process. So how?
There are many many ways to deal with this, but it boils down to 2 things:
being able to use more cores per server
being able to scale to more than one server.
node-cluster
For the first option, you can use node-cluster or the same solution as for the second option. node-cluster (http://nodejs.org/api/cluster.html) is essentially a built-in way to fork the node process into one master and multiple workers. Typically, you'd want 1 master and n-1 to n workers (n being your number of available cores).
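A minimal sketch of what that looks like (restarting dead workers is optional but common):

var cluster = require('cluster');
var os = require('os');

if (cluster.isMaster) {
  // Fork one worker per core; they all share the same listening port.
  os.cpus().forEach(function () { cluster.fork(); });
  cluster.on('exit', function (worker) {
    console.log('worker ' + worker.process.pid + ' died, restarting');
    cluster.fork();
  });
} else {
  var app = require('express')();
  app.get('/', function (req, res) { res.send('Hello world!'); });
  app.listen(3000);
}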
load balancers
The second option is to use a load balancer that distributes the requests amongst multiple workers (on the same server, or across servers).
Here you have multiple options as well. Here are a few:
a node based option: Load balancing with node.js using http-proxy (see the sketch after this list)
nginx: Node.js + Nginx - What now? (using more than one upstream server)
apache: (no clearly helpful link I could use, but a valid option)
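For the node-based option, a round-robin proxy is only a few lines (a sketch using the http-proxy module; the backend addresses are made up):

var http = require('http');
var httpProxy = require('http-proxy');

var targets = ['http://127.0.0.1:3001', 'http://127.0.0.1:3002'];
var proxy = httpProxy.createProxyServer({});
var next = 0;

// Hand each incoming request to the next backend in turn.
http.createServer(function (req, res) {
  proxy.web(req, res, { target: targets[next++ % targets.length] });
}).listen(8080);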
One more thing: once you start having multiple processes serving requests, you can no longer use process memory to store state; you need an additional service to store shared state. Redis (http://redis.io) is a popular choice, but by no means the only one.
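For instance, a counter kept in Redis instead of process memory (a sketch; the key name is made up):

var redis = require('redis');
var client = redis.createClient(); // defaults to localhost:6379

// Every process, on every machine, sees the same counter.
client.incr('page:hits', function (err, hits) {
  if (err) return console.error(err);
  console.log('hits so far:', hits);
});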
If you use services such as Cloud Foundry, Heroku, and others, they set this up for you, so you only have to worry about your app's logic (and using a service to deal with shared state).
I've been working with node for quite some time, but recently got the opportunity to try scaling my node apps. I have been researching this topic for some time now and have come across the following prerequisites for scaling:
My app needs to be available on a distributed set of systems, each running multiple instances of node.
Each system should have a load balancer that helps distribute traffic across the node instances.
There should be a master load balancer that should distribute traffic across the node instances on distributed systems.
The master balancer should always be running OR should have a dependable restart mechanism to keep the app stable.
For the above requisites I've come across the following:
Use modules like cluster to start multiple instances of node in a system.
Always use nginx. It's one of the simplest mechanisms for creating a load balancer I've come across so far.
Use HAProxy to act as a master load balancer. A few pointers on how to use it and keep it forever running.
Useful resources:
Horizontal scaling node.js and websockets.
Using cluster to take advantage of multiple cores.
I'll keep updating this answer as I progress.
The basic way to use multiple machines is to put them behind a load balancer, and point all your traffic to the load balancer. That way, someone going to http://my_domain.com will hit the load balancer machine. The sole purpose (for this example anyway; in theory more could be done) of the load balancer is to delegate the traffic to a given machine running your application. This means that you can have x number of machines running your application, and an external machine (in this case a browser) can go to the load balancer address and get to one of them. The client doesn't (and doesn't have to) know which machine is actually handling its request. If you are using AWS, it's pretty easy to set up and manage this. Note that Pascal's answer has more detail about your options here.
With Node specifically, you may want to look at the Node Cluster module. I don't really have a lot of experience with this module, but it should allow you to spawn multiple processes of your application on one machine, all sharing the same port. Also note that it's still experimental, and I'm not sure how reliable it will be.
I'd recommend taking a look at http://senecajs.org, a microservices toolkit for Node.js. It is a good starting point for beginners, and for starting to think in "services" instead of monolithic applications.
Having said that, building distributed applications is hard, takes time to learn, and takes a LOT of time to master, and usually you will face a lot of trade-offs between performance, reliability, maintenance, etc.

Scaling Node.JS across multiple cores / servers

OK, so I have an idea I want to pursue, but before I do I need to understand a few things fully.
Firstly, the way I think I'm going to go ahead with this system is to have 3 servers, which are described below:
The first server will be my web front end; this is the server that will be listening for connections and responding to clients. This server will have 8 cores and 16GB RAM.
The second server will be the database server, pretty self-explanatory really: connect to the host and set/get data.
The third server will be my storage server; this is where downloadable files will be stored.
My first questions is:
On my front end server, I have 8 cores, what's the best way to scale node so that the load is distributed across the cores?
My second question is:
Is there a system out there that I can drop into my application framework that will allow me to talk to the other cores and pass messages around to save I/O?
and final question:
Is there any system I can use to help move the content from my storage server to the requester on the front-end server with as little overhead as possible? Speed is a concern here, as we would have 500+ clients downloading and uploading concurrently at peak times.
I have finally convinced my employer that node.js is extremely fast and the latest in programming technology, and that we should invest in a platform for our intranet system; but he has requested detailed documentation on how this could be scaled across the hardware we currently have available.
"On my front end server, I have 8 cores, what's the best way to scale node so that the load is distributed across the cores?"
Try looking at the node.js cluster module, which is a multi-core server manager.
Firstly, I wouldn't describe the setup you propose as 'scaling'; it's more like 'spreading'. You only have one app server serving the requests. If you add more app servers in the future, then you will have a scaling problem.
I understand that node.js is single-threaded, which implies that it can only use a single core. How to (or whether you can) scale it is not my area of expertise; I will leave that part to someone else.
I would suggest NFS mounting a directory on the storage server to the app server. NFS has relatively low overhead. Then you can access the files as if they were local.
Concerning your first question: use cluster (we already use it in a production system, works like a charm).
When it comes to worker messaging, I cannot really help you out, but your best bet is cluster too. Maybe there will be some functionality that provides "inter-core" messaging across all cluster workers in the future (I don't know the roadmap of cluster, but it seems like an idea).
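One way to approximate that today is to relay messages through the master process using cluster's built-in IPC (a sketch; the worker count and message shape are made up):

var cluster = require('cluster');

if (cluster.isMaster) {
  for (var i = 0; i < 4; i++) cluster.fork();
  // Relay any message a worker sends to all of the other workers.
  Object.keys(cluster.workers).forEach(function (id) {
    cluster.workers[id].on('message', function (msg) {
      Object.keys(cluster.workers).forEach(function (other) {
        if (other !== id) cluster.workers[other].send(msg);
      });
    });
  });
} else {
  process.on('message', function (msg) {
    console.log('worker ' + process.pid + ' got', msg);
  });
  process.send({ from: process.pid });
}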
For your third requirement, I'd use a low-overhead protocol like NFS or (if you can go really crazy when it comes to infrastructure) a high-speed SAN backend.
Another piece of advice: use MongoDB as your database backend. You can start with low-end hardware and scale up your database instance with ease using MongoDB's sharding/replica set features (if that is some kind of requirement).
