How to prevent a Node.js REST API app from becoming unresponsive under high CPU/RAM usage? - node.js

I'm building a Node.js + Express REST API app. So far the CPU and RAM usage has stayed at normal levels, but one of the most recently designed endpoints is taking up too much RAM and CPU:
I'm talking about an API whose goal is to generate .pdf files in real time using template data (we're using the library pdf-puppeteer). But when this API is tasked with generating hundreds of .pdf files, the Node.js application becomes unresponsive and we cannot call the other APIs: they either time out or take too long to respond, even the simpler ones.
I'm using pm2 for load balancing, and we've tried to delegate the PDF creation to worker processes so the event loop doesn't get blocked. That was successful to some extent, but CPU and RAM consumption is still very high and the APIs start to become unresponsive nevertheless.
So how can this high CPU and RAM usage be prevented for heavy processes, so the application doesn't become unresponsive? Maybe using a throttling approach?

You can use a Docker/Kubernetes stack and scale up your environment.
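The throttling idea raised in the question can be sketched as a small in-process concurrency limiter: at most a fixed number of PDF jobs run at once and the rest wait in a queue, so other endpoints keep getting CPU time. This is only a sketch under assumptions. `createLimiter` is a hypothetical helper, `renderPdf` stands in for the real pdf-puppeteer call, and under pm2 cluster mode each process would hold its own independent limit.

```javascript
// Minimal concurrency limiter: at most `limit` jobs run at once,
// the rest wait in a FIFO queue.
function createLimiter(limit) {
  let active = 0;
  const waiting = [];

  const next = () => {
    if (active >= limit || waiting.length === 0) return;
    active++;
    const { job, resolve, reject } = waiting.shift();
    Promise.resolve()
      .then(job) // run the job; a synchronous throw becomes a rejection
      .then(resolve, reject)
      .finally(() => {
        active--;
        next(); // start the next queued job, if any
      });
  };

  // Returns a promise that settles with the job's result once it has run.
  return function run(job) {
    return new Promise((resolve, reject) => {
      waiting.push({ job, resolve, reject });
      next();
    });
  };
}

// Hypothetical usage in an Express route (renderPdf is an assumed function):
// const limitPdf = createLimiter(2);
// app.post('/pdf', (req, res, next) => {
//   limitPdf(() => renderPdf(req.body)).then((buf) => res.send(buf), next);
// });
```

A limiter like this caps how much work is in flight but does not make a single PDF any cheaper; callers simply wait longer when the queue is full, so a queue length cap or a 429 response may be worth adding.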

Related

node.js CPU usage spikes

I have an express.js app running in cluster mode (Node.js cluster module) on a Linux production server. I'm using PM2 to manage this app. It usually uses less than 3% CPU. However, sometimes the CPU usage spikes up to 100% for a short period of time (less than a minute). I'm not able to reproduce this issue. It only happens once every day or two.
Is there any way to find out which function or route is causing the sudden CPU spikes, using PM2? Thanks.
I think there is some slow synchronous execution on some requests in your application.
Add logging for every incoming request in a middleware, store it in Elasticsearch, and find which requests have long response times, or use New Relic (the easy way, but it costs more money).
Use blocked-at to find slow synchronous execution; if you detect any, try moving it to worker threads or use the workerpool library.
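The per-request logging suggested above could look roughly like the following Express-style middleware. It is a sketch, not a definitive implementation: `slowRequestLogger` and its parameters are hypothetical names, and the `log` callback is where shipping to Elasticsearch (or anything else) would happen.

```javascript
// Express-style middleware that reports any request slower than `thresholdMs`.
// `log` is injectable so entries can be sent to a log store instead of the console.
function slowRequestLogger(thresholdMs = 500, log = console.warn) {
  return function (req, res, next) {
    const start = process.hrtime.bigint();
    // 'finish' fires once the response has been handed off to the OS.
    res.once('finish', () => {
      const ms = Number(process.hrtime.bigint() - start) / 1e6;
      if (ms >= thresholdMs) {
        log({
          method: req.method,
          url: req.originalUrl || req.url,
          status: res.statusCode,
          ms,
        });
      }
    });
    next();
  };
}

// Usage in an Express app (register before the routes):
// app.use(slowRequestLogger(200));
```

Note that one request blocking the event loop also inflates the measured time of every request in flight at that moment, so the earliest slow entry in a burst is usually the most informative.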
My answer is based purely on my experience with the topic.
Before going to production, do local testing such as:
stress testing.
longevity testing.
For both tests, try to use a tool like JMeter, where you can configure one or multiple endpoints and run loads against them over a period of time while monitoring CPU and memory usage.
If everything is fine, also try stopping the test and calling the API manually while monitoring its behavior; this will help you find out whether there is
a memory leak from the APIs themselves
Is your app going through .map() , .reduce() for huge arrays?
Is your app working significantly better after a reboot?
If yes, then you should suspect that the Express app is experiencing a memory leak and the garbage collector is trying to clean up the mess.
If possible, try rewriting the app using Fastify; personally, this did not make the app much faster, but it was able to handle 1.5x more requests.
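For the longevity test described above, memory can be sampled from inside the process with the built-in `process.memoryUsage()`. The helpers below are a sketch with hypothetical names (`sampleHeapMb`, `heapGrowthMb`). The idea is to compare samples taken once the load has stopped, since heap usage naturally rises and falls between garbage collections while requests are being served.

```javascript
// Takes one memory sample, rounded to 0.1 MB.
function sampleHeapMb() {
  const { heapUsed, rss } = process.memoryUsage();
  const toMb = (bytes) => Math.round((bytes / 1024 / 1024) * 10) / 10;
  return {
    time: new Date().toISOString(),
    heapUsedMb: toMb(heapUsed),
    rssMb: toMb(rss),
  };
}

// Difference in heap usage between the first and the last sample.
// A value that keeps growing from one idle period to the next hints at a leak.
function heapGrowthMb(samples) {
  if (samples.length < 2) return 0;
  return samples[samples.length - 1].heapUsedMb - samples[0].heapUsedMb;
}

// Hypothetical usage while the load test is running:
// const samples = [];
// const timer = setInterval(() => samples.push(sampleHeapMb()), 10000);
// timer.unref(); // do not keep the process alive just for sampling
```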

NodeJs slowing down when process consuming big amount of memory

So, I have a Node.js process which, when scaled, consumes around 5-8 GB of RAM. It is running within a Docker container, launched with the arg --max-old-space-size=12192 to increase the Node process memory limit.
The memory consumption is OK, since I am trying to use a dedicated server (AMD EPYC CPU, 64 GB memory) instead of horizontal scaling with AWS or another cloud provider, because it is 10x cheaper if I can make it work on a dedicated server (most of the expenses for AWS/Google Cloud go to network traffic, while the VDS has unlimited traffic. The network side is already optimised with the use of GraphQL and by minimising the number of requests). The process itself processes a huge amount of data in memory, in a multithreaded fashion. There is no further significant optimisation possible in the process code itself.
When the process memory consumption reaches 3 GB+, it slows down significantly. Docker is not limiting the container resources. The server itself is running at 5-10% load in terms of memory and CPU. SSD drive -> low drive load (low amount of I/O on the server side).
I guess rewriting the app in Go, for example, might improve it significantly; however, that is really a lot of work.
Can anything be done on the server setup / Node.js app side to prevent the slowdown?
Thanks!
Solved by horizontal scaling within a single VDS with Docker.
As jfriend00 noticed, the nature of the problem might be garbage collection; personally, I have no other guess.
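One way to check the garbage collection guess is to confirm what heap limit the process actually runs with and how full the heap is. The sketch below uses the built-in `v8.getHeapStatistics()`; `heapReport` is a hypothetical helper name. Starting Node with the `--trace-gc` flag additionally prints a line per collection, which should show whether pauses grow as the heap passes a few GB.

```javascript
const v8 = require('node:v8');

// Reports the effective V8 heap limit and current usage in MB.
// With --max-old-space-size=12192 the limit should come out near that value.
function heapReport() {
  const stats = v8.getHeapStatistics();
  const mb = (bytes) => Math.round(bytes / 1024 / 1024);
  return {
    limitMb: mb(stats.heap_size_limit),
    totalMb: mb(stats.total_heap_size),
    usedMb: mb(stats.used_heap_size),
  };
}

console.log(heapReport());
```

If the reported limit is far below the value passed on the command line, the flag is probably not reaching the process that does the work.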

ASP.NET Core 2.2 experiencing high CPU usage

So I have hosted an ASP.NET Core 2.2 web service on Azure (S2 plan). The problem is that my application sometimes gets high CPU usage (almost 99%). What I have done so far: checked the process explorer on Azure. I see a lot of processes there that are consuming CPU. Maybe someone knows whether it's okay for these processes to consume CPU?
Currently, I have no idea where they come from. Maybe it's normal to have them here.
Briefly about my application:
Currently, there is not much traffic: 500-600 requests a day. Most of the requests communicate with MS SQL by querying records, adding records, etc.
I am also using MS WebSocket, but the high CPU happens when no WebSocket client is connected to the web service, so I hardly believe that it's the cause. I tried to use Apache ab for load testing, but there isn't any pattern where a load test of one request reliably produces high CPU. During load testing it sometimes happens and sometimes doesn't.
So I just updated the screenshot of processes; I see that lots of threads are being locked/used during the time when FluentMigrator starts running its logging.
Update*
I will remove the FluentMigrator logging middleware from the Configure method and see how the situation develops.
UPDATE**
So I removed the FluentMigrator logging. Since then I haven't noticed any CPU usage over 90%.
But still, I am confused. My CPU usage is spinning. Is it a healthy CPU usage graph or not?
Also, I tried to run a load test on the WebSocket server.
I made a script that calls some WebSocket functions every 100 ms from 6-7 clients. So every 100 ms there are 7 calls to the WebSocket server from different clients, and every function itself queries/inserts some data (approximately 3-4 queries per WebSocket function).
What I noticed: on Azure S1 DTU 20, after 2 min I run out of SQL pool connections. If I increase the DTU to 100, it handles 7 clients properly without any 'no connection pool' errors.
So the first question: is this CPU spinning normal?
Second: should I get a 'no SQL connection free' error using this kind of load test on DTU 10 Azure SQL? I am afraid that by creating a scoped service in the singleton WebSocket service I am leaking connections.
This topic is getting too long; maybe I should move it to a new topic?
-
At this stage I would say you need to profile your application and figure out which areas of your code are CPU intensive. In the past I have used dotTrace, which highlighted the most expensive methods with a call tree.
Once you know what areas of your code base are the least efficient, you can begin to refactor them so that they are more efficient. This could simply be changing some small operations, adding caching for queries or using distributed locking for example.
I believe the reason the other DLLs are showing CPU usage is that your code is calling methods within those DLLs.

For a node web server, is it better to have more vCPUs or RAM

I am running a node app on a Digital Ocean cloud server, and the app merely services API requests. All client-side assets are served by a CDN, and the DB is accessed remotely, rather than stored on the server instance itself.
I have the choice of a greater number of vCPUs or RAM. I have no idea what that means in any way, so any feedback is a great help.
A single node.js server will run your JavaScript on only one CPU, so having more CPUs doesn't help your JavaScript run any faster unless you cluster your app and run multiple node.js processes sharing the load of your app, or unless there are other processes on the same server that are being used by your server.
Having more RAM (memory) will only improve things if you actually need more RAM. That depends entirely upon what the memory usage profile is of your app and how much RAM you already have available. Probably, you would already know if you were running out of RAM because you either get drastic slow-down when the OS starts page swapping or your process crashes when out of memory.
So, in order to know which would benefit you more, you really need more data on how your existing app is performing (whether it ever bogs down with CPU intensive operations and how much RAM it uses compared to how much you have available). It is quite possible that neither will actually matter to you - it totally depends upon the usage profile of your server process.
If you have no more data than this and have to make a choice, choose the vCPUs because there are some circumstances where it might help you (and gives you the option to go to clustering in the future if needed) whereas adding more RAM when you aren't even using what you already have won't help you at all.
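The clustering option mentioned above can be sketched with Node's built-in `cluster` module: a primary process forks one worker per CPU and the workers share the listening port. This is a minimal sketch under assumptions; `workerCount` and `start` are hypothetical names, the port is arbitrary, and `cluster.isPrimary` requires Node 16 or later (older versions call it `isMaster`).

```javascript
const cluster = require('node:cluster');
const http = require('node:http');
const os = require('node:os');

// One worker per CPU, but always at least one.
function workerCount(cpus = os.cpus().length) {
  return Math.max(1, cpus);
}

function start(port) {
  if (cluster.isPrimary) {
    // The primary only manages workers; it serves no requests itself.
    for (let i = 0; i < workerCount(); i++) cluster.fork();
    cluster.on('exit', (worker) => {
      console.log(`worker ${worker.process.pid} exited, starting a replacement`);
      cluster.fork();
    });
  } else {
    // Each worker runs its own event loop on the shared port.
    http
      .createServer((req, res) => res.end(`handled by ${process.pid}\n`))
      .listen(port);
  }
}

// start(3000); // uncomment to run; the port is an arbitrary example
```

With extra vCPUs and a setup like this, a CPU-heavy request stalls only the worker that received it rather than the whole server.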

How to increase thread count /parallelism in Azure web applications

I have an ASP.NET web app running on Azure. It receives users' REST requests, processes the data and sends it back.
I have set up scaling up of Azure instances if the CPU percentage goes up to 60%, and scaling down if the CPU percentage drops below 40%.
One thing I observed is that my tasks are mostly IO bound, not CPU intensive. So if the number of requests increases, my app keeps all the requests waiting.
How can I make my app process hundreds of requests at once?
I am observing that my app can process 2 requests in parallel. Is there a way to increase this parallelism?
How do I set up Azure scaling based on the number of incoming requests?
I would recommend making sure that you are correctly using async/await in applicable areas. If your web app is performing some long-running operation (such as a database call), you should definitely be using async/await to free up threads to service other requests.
Second, even a small instance should be able to handle more than two requests at a time.
Third, you can set up custom rules to scale your web application on metrics other than CPU utilization. Disk queue length is another metric that you might find useful to scale on.

Resources