I've created a Node.js (Meteor) application and I'm looking at strategies to handle scaling in the future. I've designed my application as a set of microservices, and I'm now considering implementing this in production.
What I'd like to do, however, is run many microservices on one server instance to maximise resource usage while each of them only uses a small amount of resources. I know containers are useful for this, but I'm curious whether there's a way to create a dynamically scaling set of containers where I can:
Write commands such as "provision another app container on this server if the containers running this app reach > 80% CPU/other limiting metrics",
Provision and prepare other servers if needed for extra containers,
Load balance connections between these containers (and does this affect server-level load balancing, e.g., sending fewer connections to servers with fewer containers?)
I've looked into AWS EC2, Docker Compose and nginx, but I'm uncertain if I'm going in the right direction.
Investigate Kubernetes and/or Mesos, and you'll never look back. They're tailor-made for what you're looking to do. The two components you should focus on are:
Service Discovery: This allows inter-dependent services (micro-service "A" calls "B") to "find" each other. It's typically done using DNS, but with registration features on top of it that handle what happens as instances are scaled.
Scheduling: In Docker-land, scheduling isn't about CRON jobs, it means how containers are scaled and "packed" into servers in various ways to maximize efficient usage of available resources.
There are actually dozens of options here: Docker Swarm, Rancher, and others are competing alternatives. Many cloud vendors like Amazon also offer dedicated services (such as ECS) with these features. But Kubernetes and Mesos are emerging as the standard choices, so you'd be in good company if you at least start there.
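As a tiny illustration of the service-discovery point, here is a sketch assuming a Kubernetes-style DNS setup and a hypothetical "orders-service" name (not from the question): the calling code only knows the stable DNS name, no matter how many containers sit behind it or where they run.

// Hedged sketch: service "A" calling service "B" purely by its discovered DNS name.
// "orders-service" is a placeholder name registered with the cluster's DNS/discovery layer.
const http = require('http');

http.get('http://orders-service/api/orders', (res) => {
  let body = '';
  res.on('data', (chunk) => { body += chunk; });
  res.on('end', () => console.log('response from orders-service:', body));
}).on('error', (err) => console.error('lookup/request failed:', err.message));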
Metrics can be collected via the Docker API (there are good blog posts on this), and it's often used for exactly that.
Tinkering with the Docker API and the Docker tooling stack (Compose/Swarm/Machine) gives you a lot of building blocks for scaling a microservice architecture efficiently.
I would also recommend Consul for managing discovery in such a resource-aware system.
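For example, here is a minimal sketch (the container name is a placeholder) of reading one container's stats straight from the Docker Engine API over its local unix socket; the CPU and memory figures it returns are the kind of metrics you would feed into a ">80% CPU, add a container" rule.

// Hedged sketch: fetch a single stats snapshot for one container from the Docker Engine API.
const http = require('http');

const options = {
  socketPath: '/var/run/docker.sock',                      // the local Docker daemon socket
  path: '/containers/my-app-container/stats?stream=false', // placeholder container name/id
  method: 'GET'
};

http.request(options, (res) => {
  let raw = '';
  res.on('data', (chunk) => { raw += chunk; });
  res.on('end', () => {
    const stats = JSON.parse(raw);
    // cpu_stats/precpu_stats deltas and memory_stats are what a scaling rule would evaluate
    console.log(stats.cpu_stats.cpu_usage.total_usage, stats.memory_stats.usage);
  });
}).on('error', (err) => console.error(err)).end();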
We are using AWS to host our microservices application, and ECS (the AWS Docker service) to containerize the different APIs.
In this context, we use the AWS auto scaling feature to manage scaling in and scaling out. Check this.
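As a rough sketch of what such a rule can look like when scripted with the AWS SDK for JavaScript (the cluster/service names, region, and capacity bounds below are placeholders, not our actual setup; the console achieves the same thing):

// Hedged sketch: a target-tracking policy that adds/removes containers to keep average
// service CPU around 80%, attached to a hypothetical ECS service.
const AWS = require('aws-sdk');
const autoscaling = new AWS.ApplicationAutoScaling({ region: 'us-east-1' });

const resourceId = 'service/my-cluster/my-api-service'; // placeholder ECS cluster/service

autoscaling.registerScalableTarget({
  ServiceNamespace: 'ecs',
  ResourceId: resourceId,
  ScalableDimension: 'ecs:service:DesiredCount',
  MinCapacity: 2,
  MaxCapacity: 10
}).promise()
  .then(() => autoscaling.putScalingPolicy({
    PolicyName: 'cpu-80-target-tracking',
    ServiceNamespace: 'ecs',
    ResourceId: resourceId,
    ScalableDimension: 'ecs:service:DesiredCount',
    PolicyType: 'TargetTrackingScaling',
    TargetTrackingScalingPolicyConfiguration: {
      TargetValue: 80.0, // keep average service CPU around 80%
      PredefinedMetricSpecification: { PredefinedMetricType: 'ECSServiceAverageCPUUtilization' }
    }
  }).promise())
  .then(() => console.log('scaling policy attached'))
  .catch((err) => console.error(err));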
Hope it helps.
If I use different ports for different services and they share the same database, can I call it a microservice architecture? For example, running the orders service on port 8080 and the checkout service on port 8081.
Unfortunately no, it's far more complex, and I suggest investing some hours in reading about it online from people who can explain it a lot better than me, but long story short, at a minimum you should have:
independent applications, each one with its own database, and of course each app on its own port
you most likely want to containerize them with docker, and I personally suggest orchestrating the containers with kubernetes; in that case you will need a deployment file and a ClusterIP service for each application (read up on 'pods', 'deployments', etc. in kubernetes)
the independent services should not call each other directly; instead you use what is called an event bus (see the sketch after this list). You'll want to read more about that
You can sync each service's database through the event bus. You'll duplicate some data, but this is acceptable because storage is cheap and service independence is far more important.
You'll need a Load Balancer to access your app eventually, of course
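To make the event bus point a bit more concrete, here is a very small sketch (not a production design): a bus service receives every event and re-broadcasts it to the other services so each can keep its own copy of the data in sync. The ports and service list are placeholders for the hypothetical orders/checkout services.

// Hedged sketch of a minimal HTTP event bus.
const express = require('express');
const axios = require('axios');

const app = express();
app.use(express.json());

// Placeholder endpoints of the services that want to receive events.
const services = ['http://localhost:8080/events', 'http://localhost:8081/events'];

app.post('/events', (req, res) => {
  // Fan the event out; each service ignores event types it does not care about.
  services.forEach((url) => {
    axios.post(url, req.body).catch(() => { /* a dead service should not block the bus */ });
  });
  res.send({ status: 'OK' });
});

app.listen(4005, () => console.log('event bus listening on 4005'));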
At least this is what came to mind first. You can use it as a starting point and look into it more, but as a heads up, it will not be easy. Data consistency across services is usually hell.
Microservices and Node's cluster module both solve the same issue: scalability. When should I use which?
And is there a point to integrating the cluster API for a Node app running inside a Docker container?
They're not really equivalent. Microservices solve an organizational and code-management problem (and scalability in a very dynamic way, reducing tight coupling and keeping bugs isolated to one microservice). cluster solves scalability in a very limited way, by spinning up cluster workers on the same machine. If you have one large app and generally scale vertically (by increasing the amount of computing power your hosts have), cluster is great. If not, breaking things down into services (or further into microservices) is also great.
You can also do both (your second question), for example running Node apps in containers on Kubernetes, where the Node apps use cluster. Depending on how your containers get run and how many vCPUs they're allocated, it may or may not have any effect, but it's only a couple lines of code so it doesn't hurt to add it.
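Those couple of lines look roughly like this (a sketch only; './app' is a hypothetical entry point, and how many CPUs os.cpus() reports depends on how the container is run and how many vCPUs it was given):

const cluster = require('cluster');
const os = require('os');

// With a single visible vCPU, clustering adds little, so only fork when it can help.
const cpus = os.cpus().length;

if (cluster.isMaster && cpus > 1) {
  for (let i = 0; i < cpus; i++) cluster.fork();
} else {
  require('./app'); // hypothetical module that starts the Node app
}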
Can I ask what the best practice is for running applications on a Service Fabric node: is it one service per node or multiple services per node? Let me know the advantages or disadvantages, if there are any.
One of the goals/features of using ASF is to run many (micro)services on your nodes, to make efficient use of all the system resources.
Deploy applications at higher density than virtual machines, deploying hundreds or thousands of applications per machine.
Read more here.
With the advent of docker and scheduling & orchestration services like Amazon's ECS, I'm trying to determine the optimal way to deploy my Node API. With Docker and ECS aside, I've wanted to take advantage of the Node cluster library to gracefully handle crashing the node app in the event of an asynchronous error, as suggested in the documentation, by creating a master process and multiple worker processes.
One of the benefits of the cluster approach, besides gracefully handling errors, is creating a worker process for each available CPU. But does this make sense in the docker world? Would it make sense to have multiple node processes running in a single docker container that was going to be scaled into a cluster of EC2 instances on ECS?
Without the Node cluster approach, I'd lose the ability to gracefully handle errors, so I think that at a minimum I should run a master and one worker process per docker container. I'm still confused as to how many CPUs to define in the Task Definition for ECS. The ECS documentation says something about each container instance having 1024 units per CPU, but that isn't the same thing as EC2 compute units, is it? And with that said, I'd need to pick EC2 instance types with the appropriate number of vCPUs to achieve this, right?
I understand that achieving the most optimal configuration may require some level of benchmarking my specific Node API application, but it would be awesome to have a better idea of where to start. Maybe there is some studying/research I need to do? Any pointers to guide me on the path or recommendations would be most appreciated!
Edit: To recap my specific questions:
Does it make sense to run a master/worker cluster as described here inside a docker container to achieve graceful crashing?
Would it make sense to use nearly identical code as described in the Cluster docs, to 'scale' to available CPUs via require('os').cpus().length?
What does Amazon mean in the documentation for ECS Task Definitions, where it says for the cpu setting that a container instance has 1024 units per CPU? And what would be a good starting point for this setting?
What would be a good starting point for the instance type to use for an ECS cluster aimed at serving a Node API based on the above? And how do the available vCPUs affect the previous questions?
All these technologies are new and best practices are still being established, so consider these to be tips from my experience only.
One-process-per-container is more of a suggestion than a hard and fast rule. It's fine to run multiple processes in a container when you have a use for it, especially in this case where a master process forks workers. Just use a single container and allow it to fork one process per core, as you've suggested in the question.
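A minimal sketch of that layout, assuming a hypothetical './server' module that starts the actual Node API: the master only forks and re-forks, so a crashing worker is replaced without restarting the whole container.

const cluster = require('cluster');
const os = require('os');

if (cluster.isMaster) {
  // One worker per core visible to the container.
  os.cpus().forEach(() => cluster.fork());

  cluster.on('exit', (worker, code) => {
    console.log(`worker ${worker.process.pid} died (code ${code}), starting a new one`);
    cluster.fork(); // replace the crashed worker instead of restarting the container
  });
} else {
  require('./server'); // hypothetical module that starts the API
}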
On EC2, instance types have a number of vCPUs, which will appear as a core to the OS. For the ECS cluster use an EC2 instance type such as the c3.xlarge with four vCPUs. In ECS this translates to 4096 CPU units. If you want the app to make use of all 4 vCPUs, create a task definition that requires 4096 cpu units.
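For illustration, a hedged sketch of registering such a task definition with the AWS SDK for JavaScript; the family name, image, memory value, and port mapping are placeholders, not a recommendation.

// Hedged sketch: a task definition claiming all 4096 cpu units of a 4-vCPU instance.
const AWS = require('aws-sdk');
const ecs = new AWS.ECS({ region: 'us-east-1' });

ecs.registerTaskDefinition({
  family: 'node-api',
  containerDefinitions: [{
    name: 'node-api',
    image: '123456789012.dkr.ecr.us-east-1.amazonaws.com/node-api:latest', // placeholder image
    cpu: 4096,    // 4 vCPUs * 1024 units
    memory: 6144, // MiB, placeholder
    essential: true,
    portMappings: [{ containerPort: 3000, hostPort: 80 }]
  }]
}).promise()
  .then((data) => console.log('registered:', data.taskDefinition.taskDefinitionArn))
  .catch((err) => console.error(err));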
But if you're doing all this only to stop the app from crashing you could also just use a restart policy to restart the container if it crashes. It appears that restart policies are not yet supported by ECS though.
That seems like a really good pattern. It's similar to what is done with Erlang/OTP, and I don't think anyone would argue that it's one of the most robust systems on the planet. Now the question is how to implement.
I would leverage patterns from Heroku or other similar PaaS systems that have a little more maturity. I'm not saying that Amazon is the wrong place to do this, but simply that a lot of work has been done on this in other areas that you can translate. For instance, this article has a recipe in it:
https://devcenter.heroku.com/articles/node-cluster
As far as the relationship between vCPUs and CPU units goes, it looks like it's just a straight 1:1024 ratio. It's a move toward micro-charging based on CPU utilization. They're taking this even further with the Lambda work, where they charge you based on the fractions of a second you use.
In the docker world you would run one nodejs process per docker container, but you would run many such containers on each of your EC2 instances. If you use something like fig you can use fig scale <n> to run many redundant containers on an instance. This way you don't have to define your nodejs count ahead of time, and each of your nodejs processes is isolated from the others.
Creating a node.js application is simple enough.
var app = require('express')();

app.get('/', function(req, res){
  res.send("Hello world!");
});

app.listen(3000); // without this the server never actually starts accepting connections
But suppose people became obsessed with your Hello World! application and exhausted your resources. How could this example be scaled up in practice? I don't understand it, because yes, you could start several node.js instances on different computers, but when someone accesses http://your_site.com/ it points directly at one specific machine, one specific port, one specific node process. So how?
There are many, many ways to deal with this, but it boils down to 2 things:
being able to use more cores per server
being able to scale out to more than one server.
node-cluster
For the first option, you can use node-cluster or the same solution as for the second option. node-cluster (http://nodejs.org/api/cluster.html) is essentially a built-in way to fork the node process into one master and multiple workers. Typically, you'd want 1 master and n-1 to n workers (n being the number of available cores).
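A minimal sketch of that split, close to the example in the Node docs: all workers listen on the same port, and the cluster module spreads incoming connections across them.

const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isMaster) {
  // 1 master, one worker per core.
  os.cpus().forEach(() => cluster.fork());
} else {
  http.createServer((req, res) => {
    res.end(`Hello world! (served by worker ${process.pid})\n`);
  }).listen(8000);
}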
load balancers
The second option is to use a load balancer that distributes the requests amongst multiple workers (on the same server, or across servers).
Here you have multiple options as well. Here are a few:
a node based option: Load balancing with node.js using http-proxy (see the sketch after this list)
nginx: Node.js + Nginx - What now? (using more than one upstream server)
apache: (no clearly helpful link I could use, but a valid option)
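To illustrate the node-based option, here is a hedged sketch of a tiny round-robin balancer built on the http-proxy package; the two upstream ports are placeholders for app instances started separately.

const http = require('http');
const httpProxy = require('http-proxy');

const targets = ['http://localhost:3001', 'http://localhost:3002']; // placeholder app instances
const proxy = httpProxy.createProxyServer({});
let next = 0;

http.createServer((req, res) => {
  const target = targets[next];
  next = (next + 1) % targets.length;        // simple round robin
  proxy.web(req, res, { target }, (err) => { // report upstream failures
    res.statusCode = 502;
    res.end('upstream error: ' + err.message);
  });
}).listen(8080, () => console.log('balancer listening on 8080'));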
One more thing: once you start having multiple processes serving requests, you can no longer use in-process memory to store state; you need an additional service to store shared state. Redis (http://redis.io) is a popular choice, but by no means the only one.
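As a small illustration of that point, here is a sketch of a hit counter kept in Redis instead of in process memory (using the node-redis v4 promise API; the connection URL is a placeholder), so every process behind the balancer sees the same value.

const express = require('express');
const { createClient } = require('redis');

const app = express();
const client = createClient({ url: 'redis://localhost:6379' }); // placeholder Redis host

app.get('/', async (req, res) => {
  const hits = await client.incr('hits'); // atomic and shared across all processes/servers
  res.send(`Hello world! You are visitor number ${hits}\n`);
});

client.connect().then(() => app.listen(3000));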
If you use services such as Cloud Foundry, Heroku, and others, they set this up for you so you only have to worry about your app's logic (and using a service to deal with shared state).
I've been working with node for quite some time, but recently got the opportunity to try scaling my node apps. I've been researching this topic for a while now and have come across the following prerequisites for scaling:
My app needs to be available on a distributed set of systems, each running multiple instances of node
Each system should have a load balancer that helps distribute traffic across the node instances.
There should be a master load balancer that should distribute traffic across the node instances on distributed systems.
The master balancer should always be running OR should have a dependable restart mechanism to keep the app stable.
For the above requirements I've come across the following:
Use modules like cluster to start multiple instances of node in a system.
Use nginx, always. It's one of the simplest mechanisms for creating a load balancer I've come across so far.
Use HAProxy to act as a master load balancer. A few pointers on how to use it and keep it forever running.
Useful resources:
Horizontal scaling node.js and websockets.
Using cluster to take advantages of multiple cores.
I'll keep updating this answer as I progress.
The basic way to use multiple machines is to put them behind a load balancer and point all your traffic to it. That way, when someone goes to http://my_domain.com, they hit the load balancer machine. The sole purpose (for this example anyway; in theory more could be done) of the load balancer is to delegate the traffic to a given machine running your application. This means that you can have x number of machines running your application, while an external machine (in this case a browser) can go to the load balancer address and reach one of them. The client doesn't (and doesn't have to) know which machine is actually handling its request. If you are using AWS, it's pretty easy to set up and manage this. Note that Pascal's answer has more detail about your options here.
With Node specifically, you may want to look at the Node Cluster module. I don't really have a lot of experience with this module, but it should allow you to spawn multiple processes of your application on one machine, all sharing the same port. Also note that it's still experimental, and I'm not sure how reliable it will be.
I'd recommend taking a look at http://senecajs.org, a microservices toolkit for Node.js. It's a good starting point for beginners, and for starting to think in "services" instead of monolithic applications.
Having said that, building distributed applications is hard, takes time to learn, takes a LOT of time to master, and you will usually face a lot of trade-offs between performance, reliability, maintenance, etc.