Node.js scaling out on Kubernetes - node.js

I built an app on node.js using Docker and I'm not sure how to scale it on a Kubernetes cluster so that I get the most out of my cluster hardware.
From a performance perspective which of the following is better:
clusterize my node app and run as many containers as needed
or
just run as many containers as needed without clustering ?
When I say clustering I mean this https://nodejs.org/api/cluster.html
My app is a simple CRUD Api backed by mongoDB. We estimate that it will have 1000 concurrent users. Our cluster has 3 nodes.

The NodeJS cluster mechanism is useful to allow NodeJS to more effectively use greater than a single core, so depending on your code it may benefit you, but it's highly dependent on your code and the various dependencies and how well they work (or not) with clustering.
As a general practice, if you can break your containers down into nicely parallelized efforts that can be run as pods within kubernetes, then I'd recommend the following as a process to see what works for you:
set up a single pod with your code in it, and run a load test against it. Use the data that Kubernetes has from cAdvisor to characterize how much resources (cpu & memory) your pod likes to have.
set a resource limit for cpu and memory based on what you see above.
run a load test to validate what your single pod handles in terms of scale
And from there, you have a baseline where you can use Kubernetes to scale this horizontally to validate the 1000 user concurrent baseline you want to achieve.
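Once the single-pod load test gives a per-pod capacity, the replica count is simple arithmetic. Here is a minimal sketch. The function name, the 20% headroom default and the figure of 150 users per pod are all assumptions for illustration, not measurements from the question.

```javascript
// Hypothetical helper: estimate how many pod replicas a target load needs,
// given the per-pod capacity measured in the single-pod load test.
function replicasNeeded(targetConcurrentUsers, usersPerPod, headroom = 0.2) {
  if (usersPerPod <= 0) {
    throw new RangeError('usersPerPod must be positive');
  }
  // Keep some spare capacity so one pod restarting does not overload the rest.
  const effectiveCapacity = usersPerPod * (1 - headroom);
  return Math.ceil(targetConcurrentUsers / effectiveCapacity);
}

// 150 users per pod is an invented measurement used only as an example.
console.log(replicasNeeded(1000, 150));
```

The result is only a starting point; the load test against the scaled deployment is what validates it.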
There's a good talk on this process from the 2017 Kubecon called Load Testing Kubernetes: How to optimize your cluster resource allocation in production
Once you have a baseline, you can run a prototype out leveraging the clustering in your code, and then compare against the non-clustered version. If you do this, I'd double-check that any limits you set are > 1 core for CPU, or you'll be self-limiting outside of the NodeJS runtime to get access to multiple cores, which would defeat the purpose of using clustering.
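To make the CPU-limit check concrete, here is a small sketch that turns a Kubernetes CPU quantity (for example "500m" or "2") into a worker count. The helper names are hypothetical, and how the limit reaches the process (an environment variable, a config file) is left open as an assumption.

```javascript
// Parse a Kubernetes CPU quantity such as "500m" or "2" into a number of cores.
function parseCpuQuantity(quantity) {
  const text = String(quantity).trim();
  const cores = text.endsWith('m')
    ? Number(text.slice(0, -1)) / 1000 // millicores, e.g. "1500m" is 1.5 cores
    : Number(text);
  if (!Number.isFinite(cores) || cores <= 0) {
    throw new RangeError(`invalid CPU quantity: ${quantity}`);
  }
  return cores;
}

// Clustering only pays off when the limit allows more than one full core,
// so anything below 2 cores maps to a single worker.
function workersForCpuLimit(quantity) {
  return Math.max(1, Math.floor(parseCpuQuantity(quantity)));
}

console.log(workersForCpuLimit('500m'));
console.log(workersForCpuLimit('2500m'));
```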
Depending on what you're doing in your code, there may be significant re-work needed to enable clustering, as it wants to leverage its own worker concept, and it's not clear what frameworks you're using and if they'll fit reasonably into that structure.

Related

Node.js Cluster module vs Microservices

They both solve the same issue - scalability. When to use which?
And is there a point to integrating cluster API for node app running inside a docker container?
They're not really equivalent. Microservices solve an organizational and code management problem: they provide scalability in a very dynamic way, reduce tight coupling, and keep bugs isolated to one microservice. cluster solves scalability in a very limited way, by spinning out cluster workers on the same machine. If you have one large app and generally scale vertically (by increasing the amount of computing power your hosts have), cluster is great. If not, breaking things down into services (or further down into microservices) is also great.
You can also do both (your second question), for example running Node apps in containers on Kubernetes, where the Node apps use cluster. Depending on how your containers get run and how many vCPUs they're allocated, it may or may not have any effect, but it's only a couple lines of code so it doesn't hurt to add it.
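For a sense of what "a couple lines of code" looks like, here is a minimal sketch of a cluster wrapper. The function name `runClustered` and the injectable `clusterApi` parameter are assumptions added so the sketch can be exercised without forking real processes; only `cluster.fork()`, `cluster.isPrimary` and the older `cluster.isMaster` come from the Node.js cluster module.

```javascript
const cluster = require('node:cluster');

// Fork `workerCount` workers in the primary; run `startWorker` in each worker.
// `clusterApi` is injectable only so the sketch can be tried without forking.
function runClustered(workerCount, startWorker, clusterApi = cluster) {
  // cluster.isPrimary was added in Node 16; older releases expose isMaster.
  const isPrimary = clusterApi.isPrimary ?? clusterApi.isMaster;
  if (isPrimary) {
    for (let i = 0; i < workerCount; i += 1) {
      clusterApi.fork();
    }
    return 'primary';
  }
  startWorker();
  return 'worker';
}

// Usage (assumed entry point, port 3000 is arbitrary):
// runClustered(require('node:os').cpus().length, () => {
//   require('node:http').createServer((req, res) => res.end('ok')).listen(3000);
// });
```

Whether the extra workers help still depends on how many vCPUs the container is actually allocated.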

Is running the Elasticsearch service on the same server as the Node server, with auto scaling, a good idea?

I'm trying to deploy a project on a t3.large server with auto scaling.
My Elasticsearch service is deployed on the same system as my Node and React projects (I'm not using AWS Elasticsearch).
Will this cause issues in the future, and do I need to move the Elasticsearch service to a separate server?
It's always nice to have a separate dedicated server for running Elasticsearch, but as you are using AWS, here is some background and some things you can do to minimize the issues:
Elasticsearch is a stateful application, in contrast to your Node and React apps (unless you are storing state there as well, which is not a good idea). Because those applications are stateless, autoscaling is very useful for them: you can scale the instances up or down on demand based on CPU, memory or other metrics.
With Elasticsearch or other stateful applications it becomes tricky: when you scale instances up or down, shards get relocated if their node is not reachable within a threshold, which can lead to an unbalanced Elasticsearch cluster.
Now in order to minimize these issues:
Make sure you are storing Elasticsearch indices on a network-attached disk so that there is no data loss when autoscaling brings up a new instance; the new instance should reuse the earlier network-attached EBS volume (where your data is stored).
Make sure your autoscaling policy doesn't create a new Elasticsearch process when it scales instances up or down; the number of Elasticsearch processes should be fixed and only scaled up or down with manual intervention.
If you have to scale up the Elasticsearch cluster then make sure you disable shard allocation to avoid the issues mentioned earlier.
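As a rough illustration of the last point, shard allocation is controlled by the `cluster.routing.allocation.enable` setting of the Elasticsearch cluster settings API. The sketch below only builds the request; the helper name is hypothetical, the choice of `persistent` over `transient` is an assumption, and sending it to your cluster (host, authentication, HTTP client) is left out on purpose.

```javascript
// Build the request that toggles shard allocation through the cluster settings API.
// Nothing is sent here; pass the result to whatever HTTP client you already use.
function shardAllocationRequest(enable) {
  return {
    method: 'PUT',
    path: '/_cluster/settings',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      persistent: {
        // 'none' stops shard allocation; 'all' restores the default behaviour.
        'cluster.routing.allocation.enable': enable ? 'all' : 'none',
      },
    }),
  };
}

const disableRequest = shardAllocationRequest(false);
console.log(disableRequest.method, disableRequest.path, disableRequest.body);
```

Remember to re-enable allocation (the `true` case) once the new node has joined, otherwise shards will stay where they are.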
These are some known issues you might face, and there could be more depending on your configuration. While writing this answer I felt it would be much easier to just have a dedicated instance for Elasticsearch and avoid these weird issues.
I would add the following to the other answers:
Elasticsearch performs best if it has enough RAM to keep indexes entirely in RAM. If Elasticsearch is competing with the Node application for RAM, its performance will suffer.
From a maintenance/performance perspective you should consider having at least a 3-node cluster, even if that means you have smaller machines. If AWS is upgrading infrastructure and you have 1 machine, then when that 0.05% unavailability hits, your search is down. If you need to do maintenance on a node or do upgrades, having multiple machines will help with availability.
Depending on your use of Elasticsearch and how often you update/delete items in the indexes, and how fast your indexes will grow, adding more machines/nodes to the cluster will help with growth.
There are probably many more things to consider, but that totally depends on your application, budget, SLAs etc.

How to use clusters in node js?

I am very new to Node.js and express. I am currently learning it by building my own services.
I recently read about clusters. I understood what clusters do. What I am not able to understand is how to make use of clusters in a production application.
One way I can think of is to use the Master process to just sit in front and route the incoming request to the next available child process in a round robin fashion. I am not sure if this is how it is designed to be used. I would like to know how should clusters be used in a typical web application.
Thanks.
The node.js cluster module is used any time you want to spread the request processing across multiple node.js processes. This is most often used when you wish to increase your ability to handle more requests/second and you have multiple CPU cores in your server. By default, a single instance of node.js will not fully utilize multiple cores because the core Javascript you run in your server is single threaded (uses one core). Node.js itself does use threads for some things internally, but that's still unlikely to fully utilize a multi-core system. Setting up a clustered node.js process for each CPU core will allow you to better maximize the available compute resources.
Clustering also provides you with some additional fault tolerance. If one cluster worker goes down, you can still have other live workers serving requests while the failed worker restarts.
The cluster module for node.js has a couple different scheduling algorithms - the round robin you mention is one. You can read more about that here: Cluster Round-Robin Load Balancing.
Because each cluster worker is a separate process, there is no automatic shared data among the different worker processes. As such, clustering is simplest to implement either where there is no shared data or where the shared data is already in a place that it can be accessed by multiple processes (such as in a database).
Keep in mind that a single node.js process (if written to properly use async I/O and not heavily compute bound) can serve many requests itself at once. Clustering is for when you want to expand scalability beyond what one instance can deliver.
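The fault-tolerance point above can be sketched as a primary that re-forks workers when they die. The function name and the injectable `clusterApi` and `log` parameters are assumptions for illustration; the `'exit'` event and `worker.exitedAfterDisconnect` are part of the cluster module.

```javascript
const cluster = require('node:cluster');

// Re-fork a worker whenever one dies unexpectedly.
// `clusterApi` and `log` are injectable only so the sketch is easy to try out.
function respawnOnExit(clusterApi = cluster, log = console.error) {
  clusterApi.on('exit', (worker, code, signal) => {
    // exitedAfterDisconnect is true for intentional shutdowns; skip those.
    if (worker.exitedAfterDisconnect) {
      return;
    }
    log(`worker ${worker.process.pid} died (${signal || code}), restarting`);
    clusterApi.fork();
  });
}

// Usage in the primary, before or after forking the initial workers:
// respawnOnExit();
// To force round-robin scheduling, set it before the first fork:
// cluster.schedulingPolicy = cluster.SCHED_RR;
```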
I have created a PoC on clustering in Node.js and added some details in the blog posts below. Have a look through them; they may provide some clarity.
https://jksnu.blogspot.com/2022/02/cluster-in-node-js-application.html
https://jksnu.blogspot.com/2022/02/cluster-management-in-node-js.html

Docker containers and Node.js clusters

I have an api server running Node.js that was using its cluster module and testing looked to be pretty good. Now our IT department wants to move to using Docker containers which I am happy about but I've never actually used it other than just playing around. But I had a thought, the Node.js app runs within a single Docker process so the cluster module wouldn't really be the best as the single Docker process can be a slow point of the setup until the request is split up within that process by the cluster module.
So really a cluster of Docker containers running being able to start and stop them on the fly is more important than using Node.js' cluster module correct?
If I have a cluster of containers, would using Node.js' cluster module get me anything? The api endpoints take less than .5sec to return (usually quite a bit less).
I'm using MySQL (believe it's a single server, nothing more currently) so there shouldn't be any reason to use a data integrity solution then.
What I've seen as the best solution when using Docker is to keep as few processes per container as possible. Since containers are lightweight, you don't want processes trying to use more than one CPU. So, running a cluster in the container won't add any value and might worsen latency.
Here https://medium.com/#CodeAndBiscuits/understanding-nodejs-clustering-in-docker-land-64ce2306afef#.9x6j3b8vw Chad Robinson explains the idea in general terms.
Kubernetes, Rancher, Mesos and other container management layers handle the load-balancing. They provide "scheduling" (moving those Docker container slices around different CPUs and machines to get a good usage across the cluster) and "networking" (load balancing inbound requests to those containers) layers internally.
Update
I think it's worth adding the link Why it is recommended to run only one process in a container? where people share their ideas and experiences, but chiefly from Jon there are some interesting points:
Provided that you give a single responsibility (single process, function or concern) to a container: Good idea Docker names this 'concern' ;)
Scaling containers horizontally is easier.
It can be re-used in different projects.
Identifying issues and troubleshooting is a breeze compared to doing it in an entire application environment. Also, logging and reporting can be more accurate and detailed.
Upgrades/Downgrades can be gradually and fully controlled.
Security can be applied to specific resources and at different levels.
You'll have to measure to be sure, but my hunch would be running with node's cluster module would be worthwhile. It would get you more CPU utilization with the least amount of extra overhead. No extra containers to manage (start, stop, monitor). Plus the cluster workers have an efficient communication mechanism. The most reasonable evolution (don't skip steps) would seem to me:
1 container, 1 node process
1 container, several clustered node workers
several containers, each with several node workers
I have a system with 4 logical cores, and I ran the following lines on my machine as well as in Docker installed on the same machine.
const numCPUs = require('os').cpus().length;
console.log(numCPUs)
These lines print 4 on my machine and 2 inside the Docker container, which means that if we use clustering in the Docker container only 2 instances would be running. So the Docker container doesn't see the same cores as the actual machine does. Also, running 5 Docker containers with clustering enabled gives 10 instances on the machine, which will ultimately be managed by the OS kernel with 4 logical cores.
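Since the detected CPU count inside a container may not match what the container can really use, one option is to let the deployment override it. The sketch below assumes a hypothetical `WORKER_COUNT` environment variable and hypothetical helper names; `os.cpus()` is standard, and `os.availableParallelism()` exists only in newer Node releases, hence the fallback.

```javascript
const os = require('node:os');

function detectedCpus(osApi = os) {
  // os.availableParallelism() exists in newer Node releases; fall back to cpus().
  return typeof osApi.availableParallelism === 'function'
    ? osApi.availableParallelism()
    : osApi.cpus().length;
}

// WORKER_COUNT is a hypothetical environment variable for an explicit override.
function chooseWorkerCount(env = process.env, osApi = os) {
  const requested = Number.parseInt(env.WORKER_COUNT ?? '', 10);
  if (Number.isInteger(requested) && requested > 0) {
    return requested;
  }
  return Math.max(1, detectedCpus(osApi));
}

console.log(chooseWorkerCount());
```

Setting `WORKER_COUNT=1` effectively disables clustering for a container without changing the code.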
So I think the best approach is to use multiple Docker container instances in swarm mode with Node.js clustering disabled. This should give the best performance.

What is the optimal way to run a Node API in Docker on Amazon ECS?

With the advent of docker and scheduling & orchestration services like Amazon's ECS, I'm trying to determine the optimal way to deploy my Node API. With Docker and ECS aside, I've wanted to take advantage of the Node cluster library to gracefully handle crashing the node app in the event of an asynchronous error as suggested in the documentation, by creating a master process and multiple worker processes.
One of the benefits of the cluster approach, besides gracefully handling errors, is creating a worker process for each available CPU. But does this make sense in the docker world? Would it make sense to have multiple node processes running in a single docker container that was going to be scaled into a cluster of EC2 instances on ECS?
Without the Node cluster approach, I'd lose the ability to gracefully handle errors and so I think that at a minimum, I should run a master and one worker process per docker container. I'm still confused as to how many CPUs to define in the Task Definition for ECS. The ECS documentation says something about each container instance having 1024 units per CPU; but that isn't the same thing as EC2 compute units, is it? And with that said, I'd need to pick EC2 instance types with the appropriate amount of vCPUs to achieve this right?
I understand that achieving the most optimal configuration may require some level of benchmarking my specific Node API application, but it would be awesome to have a better idea of where to start. Maybe there is some studying/research I need to do? Any pointers to guide me on the path or recommendations would be most appreciated!
Edit: To recap my specific questions:
Does it make sense to run a master/worker cluster as described here inside a docker container to achieve graceful crashing?
Would it make sense to use nearly identical code as described in the Cluster docs, to 'scale' to available CPUs via require('os').cpus().length?
What does Amazon mean in the documentation for ECS Task Definitions, where it says for the cpus setting, that a container instance has 1024 units per CPU? And what would be a good starting point for this setting?
What would be a good starting point for the instance type to use for an ECS cluster aimed at serving a Node API based on the above? And how do the available vCPUs affect the previous questions?
All these technologies are new and best practices are still being established, so consider these to be tips from my experience only.
One-process-per-container is more of a suggestion than a hard and fast rule. It's fine to run multiple processes in a container when you have a use for it, especially in this case where a master process forks workers. Just use a single container and allow it to fork one process per core, as you've suggested in the question.
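For the graceful-crash part of the question, a worker can stop accepting connections and exit after a fatal error, leaving the master (or a container restart) to bring up a replacement. This is only a sketch under assumptions: the function name, the 5 second grace period and the injectable `exit` parameter are made up for illustration, and `server` is expected to be an `http.Server`.

```javascript
// Build a handler that shuts a worker down after a fatal error.
// `server` is expected to be an http.Server; `exit` is injectable for testing.
function makeFatalErrorHandler(server, exit = process.exit, graceMs = 5000) {
  let shuttingDown = false;
  return function onFatalError(err) {
    if (shuttingDown) return;
    shuttingDown = true;
    console.error('fatal error, shutting worker down:', err);
    // Force the exit if open connections do not drain within the grace period.
    const timer = setTimeout(() => exit(1), graceMs);
    timer.unref();
    // Stop accepting new connections, then exit once in-flight requests finish.
    server.close(() => {
      clearTimeout(timer);
      exit(1);
    });
  };
}

// Usage inside a worker (assumed wiring):
// process.on('uncaughtException', makeFatalErrorHandler(server));
```

The non-zero exit code lets whatever supervises the process tell a crash from a clean shutdown.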
On EC2, instance types have a number of vCPUs, which will appear as a core to the OS. For the ECS cluster use an EC2 instance type such as the c3.xlarge with four vCPUs. In ECS this translates to 4096 CPU units. If you want the app to make use of all 4 vCPUs, create a task definition that requires 4096 cpu units.
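The vCPU-to-CPU-unit conversion above is plain multiplication, sketched here with a hypothetical helper name:

```javascript
// ECS expresses CPU in units, with 1024 units per vCPU.
const CPU_UNITS_PER_VCPU = 1024;

function ecsCpuUnits(vcpus) {
  return Math.round(vcpus * CPU_UNITS_PER_VCPU);
}

console.log(ecsCpuUnits(4));    // 4096, the c3.xlarge case from the answer
console.log(ecsCpuUnits(0.25)); // 256, a quarter of one vCPU
```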
But if you're doing all this only to stop the app from crashing you could also just use a restart policy to restart the container if it crashes. It appears that restart policies are not yet supported by ECS though.
That seems like a really good pattern. It's similar to what is done with Erlang/OTP, and I don't think anyone would argue that it isn't one of the most robust systems on the planet. Now the question is how to implement it.
I would leverage patterns from Heroku or other similar PaaS systems that have a little bit more maturity. I'm not saying that amazon is the wrong place to do this, but simply that a lot of work has been done with this in other areas that you can translate. For instance, this article has a recipe in it:
https://devcenter.heroku.com/articles/node-cluster
As far as the relationships between vCPU and Compute Units, it looks like it's just a straight ratio of 1/1024. It is a move toward microcharges based on CPU utilization. They are taking these even farther with the lambda work. They are charging you based on fractions of a second that you utilize.
In the docker world you would run 1 nodejs process per docker container, but you would run many such containers on each of your ec2 instances. If you use something like fig you can use fig scale <n> to run many redundant containers on an instance. This way you don't have to define your nodejs count ahead of time, and each of your nodejs processes is isolated from the others.
