I have 4 EC2 instances running on AWS. PM2 is running in cluster mode on all instances. When I get 5K+ concurrent requests, the response time of the app increases significantly.
All requests fetch a Redis key. A fetch that normally takes only 50 ms takes up to 10 seconds under this many concurrent requests. What can the issue be here?
We need to pinpoint the bottleneck. Let's do some diagnostics:
Are the EC2 instances multi-core, so they can take advantage of PM2's clustering?
When you execute pm2 start app.js -i X, are you sure X equals the number of vCPUs of the EC2 instance? (See the sketch after this list.)
When you execute pm2 monit, do you see all instances of the cluster sharing CPU and memory usage roughly equally?
When you run htop, what is your total CPU and memory usage percentage?
When you run iftop, what is your total RX and TX traffic compared to the maximum bandwidth available on your machine?
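On the second point, a minimal ecosystem file that ties the cluster size to the instance's vCPU count could look like the sketch below; the app name and script path are placeholders.

// ecosystem.config.js -- minimal sketch; name and script are placeholders
module.exports = {
  apps: [{
    name: 'app',
    script: './app.js',
    exec_mode: 'cluster', // run in PM2 cluster mode instead of fork mode
    instances: 'max'      // fork one process per available vCPU (or set an explicit count)
  }]
};

Started with pm2 start ecosystem.config.js, this should give you one process per vCPU, which you can then verify with pm2 monit.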
Related
We have a Django application that runs with Gunicorn. We were facing site downtime because of high traffic. Gunicorn is utilizing 96% of the CPU, which is what is causing the issue.
Our system specification:
8 GB RAM, 4 CPUs
How do we set up Gunicorn so that it can handle more than 100 requests per second?
What are the system specifications required for handling 100 requests per second?
Should the number of Gunicorn workers equal the CPU count?
Keep only one worker and increase the number of threads in that worker, or use an asynchronous worker class such as gevent in Gunicorn.
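For example, on the 4-CPU box above, either of the following would be a reasonable starting point; the module path is a placeholder and the exact numbers need load testing:

gunicorn myproject.wsgi:application --workers 1 --threads 16 --worker-class gthread --bind 0.0.0.0:8000
gunicorn myproject.wsgi:application --workers 4 --worker-class gevent --worker-connections 1000 --bind 0.0.0.0:8000

The threaded variant keeps a single worker as suggested; the gevent variant needs the gevent package installed and helps most when requests are I/O-bound rather than CPU-bound.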
I am running a Node application (Socket.io) on an AWS EC2 instance (2 CPUs, 4 GB RAM) behind a load balancer with auto scaling. I've set up a systemd unit for the Node service, and it runs fine for a day at only 1% CPU; the next day it stops working.
When I check the details with top, Node's CPU utilization is 100%, but AWS monitoring shows only 50% usage and stays there.
From various references, I haven't been able to find the area of high consumption or the root cause.
Please help me figure out why Node hits only one CPU and what is causing the high consumption.
From cat /proc/PID/status
Thanks in advance!
I have a simple stateless Node app that I want to instantiate across a multi-core server (a multi-vCPU AWS instance), and I understand how PM2's cluster mode obviates the need for using the Cluster module in the app code.
I have a dual-core AWS t2.medium EC2 instance. I believe PM2 is configured correctly: at startup it spawns two processes for the app, with distinct PM2 IDs and PIDs.
PM2 is starting the app as follows:
pm2 start [app_name] -i max
PM2 lists the two processes with distinct PM2 IDs and distinct PIDs as expected.
However...
ps -U [username] -au
...suggests both processes are running on the same core.
Am I missing something? (Probably!)
Thanks in advance to anyone who can shed some light on this.
Processes aren't bound to cores; they are assigned to cores by the OS's scheduler. When your clustered program comes under load, the OS will use both cores to schedule your processes and, of course, all the other stuff it needs to run.
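If you want to see which core each process was last scheduled on, one option is to add the psr column to the ps output (username is a placeholder, as in your command); note it is only a snapshot and will move around as the scheduler rebalances:

ps -U [username] -o pid,psr,pcpu,comm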
We are running a Koa web app in 5 Fargate containers. They are pretty straightforward CRUD/REST APIs with Koa over MongoDB Atlas. We started doing capacity testing and noticed that the Node servers started to slow down significantly with plenty of headroom left on CPU (sitting at 30%), memory (sitting at or below 20%), and Mongo (still returning in under 10 ms).
To further test this, we removed the Mongo operations and just hammered our health-check endpoints. We did see a lot of throughput, but significant degradation occurred at 25% CPU and Node actually crashed at 40% CPU.
Our Fargate tasks (containers) are sized at CPU: 2048 (2 vCPUs) and memory: 4096 (4 GB).
We raised our ulimit nofile to 64000 and also set Node's --max-old-space-size to 3.5 GB. This didn't result in a significant difference.
We also don't see significant latency in our load balancer.
My expectation is that CPU or memory would climb much higher before the system began experiencing issues.
Any ideas where a bottleneck might exist?
The main issue here was that we were running containers with 2 CPUs. Since a single Node process only effectively uses 1 CPU, there was always a chunk of the CPU allocation that was never used, and the ancillary overhead never got the container to 100%. So Node would be overwhelmed on its one CPU while the other sat basically idle, which meant our autoscaling alarms never got triggered.
So we adjusted to 1-CPU containers with more horizontal scale-out (i.e., more instances).
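Concretely, the change amounts to the task-level size in the Fargate task definition looking roughly like this (1 vCPU with 4 GB is a valid Fargate combination; the rest of the task definition is unchanged):

"cpu": "1024",
"memory": "4096"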
In learning about Node.js's cluster module, I've been turning over the following architecture in my head: balancing cost against performance, would it be more beneficial (i.e., cheapest but still scalable) to run your Node.js application in a cloud provider's autoscaling group on small servers with one vCPU (say, AWS's t2.small: 1 vCPU, 2 GB memory), or to use a larger server (say, an m5.xlarge: 4 vCPUs, 16 GB memory), have Node.js cluster four child processes across the 4 vCPUs, and still autoscale?
A possible trade-off is the time it takes AWS to launch another small server when autoscaling, whereas with a low-traffic or utility app you have to bear the cost of running the larger server even when usage is low. But if the time it takes to spin up another server to handle the load is negligible, does that negate the benefit of using the cluster module?
Specifically, my question is twofold: are these two approaches feasible, and if so, is my presumption about the cluster module's usefulness in the small-server approach correct?
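For context, the in-app clustering being weighed here is essentially Node's built-in cluster module forking one worker per vCPU. A minimal sketch (Node 16+; the port and response are placeholders):

const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isPrimary) {
  // fork one worker per vCPU; the workers share the listening socket
  for (let i = 0; i < os.cpus().length; i++) cluster.fork();
  cluster.on('exit', () => cluster.fork()); // replace a worker if it dies
} else {
  http.createServer((req, res) => res.end('ok')).listen(3000);
}

PM2's cluster mode (pm2 start app.js -i max) does the same forking outside the app code, which is why the two approaches are generally interchangeable for a stateless app.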