htop shows the CPU at 100% even though I do not have the app running or anything else. The DigitalOcean dashboard metric shows the same data (100% usage) as well.
The top tasks in the htop list each take less than 10% CPU. The biggest is pm2 at ~5.2%.
Is it possible that there are hidden tasks that are not displayed in the list, and, in general, how can I start investigating what's going on?
My droplet used this one-click installation:
https://marketplace.digitalocean.com/apps/nodejs
Thanks in advance!
Update 1)
The droplet has a lot of free disk space.
I ran pm2 save --force to sync running processes and the CPU went back to normal.
I guess an app was stuck or something and was eating all the CPU.
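For anyone hitting the same thing, a few standard pm2 commands help narrow down a misbehaving managed process before resorting to a blind restart:
pm2 list
pm2 monit
pm2 logs --lines 100
pm2 list shows status plus per-process CPU and memory, pm2 monit gives a live dashboard, and pm2 logs often reveals a crash/restart loop.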
Related
I have a problem with increasing kernel CPU usage on a web server I am running. On a 6-core CPU, the kernel usage increases from 5% to 50% over some 8 hours.
I have noticed it takes less time when there are more active users on the site, and I don't have this problem in dev, so I don't have any code that can reproduce the problem. I am hoping for some advice on how to troubleshoot this, though: what should I investigate to figure out what the problem is?
"pm2 restart" brings the CPU usage back down, so this is what I need to do every 8 hours or so. I have also noticed the CPU usage of systemd-resolved increasing to some 50% over 8 hours, but restarting it with "systemctl restart systemd-resolved" does not help.
I am running Ubuntu 20.04, Node v12.19.0, Next 9.5.3, Express, express-session, express-socket.io.session, MongoDB, etc. I have had this problem even on older versions of all of these.
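As a stopgap, the restart-every-8-hours workaround described above can be automated with pm2's cron_restart option instead of doing it by hand. A minimal ecosystem file sketch, where the app name and script path are placeholders:
// ecosystem.config.js
module.exports = {
  apps: [{
    name: "web",                // placeholder app name
    script: "./server.js",      // placeholder entry point
    cron_restart: "0 */8 * * *" // restart every 8 hours
  }]
};
Start it with pm2 start ecosystem.config.js. This only treats the symptom; the leak itself still needs to be found by profiling.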
I've been struggling to run multiple instances of Puppeteer on DigitalOcean for quite some time, with little luck. I'm able to run ~5 concurrently using tools like puppeteer-cluster, but for some reason the whole thing just chokes, with little helpful messaging. So I switched to spawning ~5 child processes without any additional library, just Puppeteer itself. Same issue: it chokes with no helpful errors.
I'm able to run all of these jobs just fine locally, but after I deploy, I hit these walls. So, my hunch is that it's a resource/performance issue, but I can't say for sure.
I'm running a droplet with 1 GB of RAM and 3 vCPUs on DigitalOcean.
Basically, I'm just looking for ways to start troubleshooting something like this. Is there a way I can know for sure that I'm hitting resource walls? I've tried pm2 and the DO dashboard graphs, but I feel like those are leaving a lot of information out, or else I'm missing something else altogether.
Author of puppeteer-cluster here. You are right, 1 GB of memory is likely not enough for running 5 browser windows (or tabs) in addition to your operating system and maybe even other background tasks.
Here is a list of resources you should check:
Memory: Use a tool like htop to check your memory usage while your application is running.
CPU: Again, you can use htop for that; 3 vCPUs should be more than enough for 5 windows.
Disk space: Use a tool like df to check if there is enough space on the disk. I know of multiple cases in which there was not enough space on the disk (like some old kernels filling the disk), and Chrome needs at least some space to run.
Network throughput: Rarely the problem, but sometimes the network just does not have the bandwidth to support many open browsers. Use a tool like nload to check the network throughput.
To use htop or nload, you start your script in the background (node script.js &) or use a terminal multiplexer (like tmux). Resource problems should then be easy to spot.
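Putting the list above into concrete commands (nload usually needs to be installed first, e.g. with apt install nload):
node script.js &
htop
df -h
nload
The first line just pushes the app into the background so the terminal stays free; the other three are the interactive monitoring tools mentioned above.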
Most probably you're running out of memory; 5 Puppeteer processes are a lot for a 1 GB VM.
You can run
grep -i 'killed process' /var/log/messages
to confirm that the OOM killer terminated your processes.
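One caveat: /var/log/messages is the traditional Red Hat location. On Ubuntu, which DigitalOcean's one-click images are based on, the kernel log goes to /var/log/syslog, and the kernel ring buffer works everywhere:
grep -i 'killed process' /var/log/syslog
dmesg | grep -i 'killed process'
journalctl -k | grep -i 'killed process'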
I have a service that I run daily in the background with a database about 140 MB in size. The calculations I run require me to load all 140 MB into Node at once, and after a minute or so the process quickly reaches Heroku's 512 MB limit and gets restarted.
In the meantime, a quick solution is to upgrade the server to 2X so I get 1 GB of RAM, but within a month or so the database will outgrow that as well.
As far as Heroku goes, is my only option basically to upgrade the dyno? Since these are calculations I do once per day, I would rather run them locally on my machine and upload the results than pay $250-500/month for the Performance dynos.
I know I could also just upgrade to the Performance dynos to run these services and then downgrade once finished, but I'm looking for something I can automate and not have to deal with each day.
Thanks for reading.
Heroku Scheduler seems to fit your use case exactly. You can schedule your task to run daily on a One-Off Dyno of any size, and since Heroku pricing is "prorated to the second" you will only pay for the time that your task is running on that Dyno.
I haven't actually used this, but I was about to recommend a similar solution on AWS when I searched and found this feature of Heroku.
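For reference, the same one-off dyno mechanism can also be driven from the Heroku CLI; a sketch, where batch.js stands in for the daily job script:
heroku addons:create scheduler:standard
heroku run --size=performance-l node batch.js
The first command attaches the Scheduler add-on (the job itself is then configured in the dashboard); the second runs the job once on a larger dyno, billed only for its runtime.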
I have been trying to do some load testing (using BlazeMeter) on my Node.js app running on Amazon EC2. I started testing with 500 users hitting my endpoint on a t2.micro, but this soon collapsed (maxing both memory and CPU). I tried similar tests on the t2.small, t2.medium and c3.large; with all of these, the memory was fine, but I would end up maxing the CPU, and eventually response times would climb above 60 seconds and I would get a 504 Gateway Timeout from nginx.
I have tried profiling my app with Nodetime but nothing looks very strange:
It doesn't seem like any of the listed tasks is using much CPU, but top tells me that the CPU (or CPUs, on the c3.large) is totally maxed, which leaves me a bit confused.
Am I reading this wrong?
EDIT:
Here is a screenshot of top showing node maxing the CPU:
Try profiling for a longer period, or at a time when the CPU usage is high.
Your profiling window probably did not catch the peak CPU usage.
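One built-in way to catch the peak in the act: start the app with the inspector enabled (server.js is a placeholder for your entry point), open chrome://inspect in Chrome to attach DevTools, and record a CPU profile while the load test is running:
node --inspect server.js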
I am new to the pm2 concept. I am facing a problem where my CPU usage increases up to 100%, memory fills up, and my server goes down, crashing the website. Can anyone please advise me on this? Do I need to change the configuration of my production (live) server, such as increasing memory? My code is also necessary and sufficient. I am an EC2 user.
The system requirements will mostly depend on your application, which you have told us nothing about. If the CPU reaches 100%, then you likely have some tight loop that is actively adding delays by burning cycles synchronously, or something like that. 100% memory usage can mean a memory leak, and in that case no amount of RAM will be sufficient, because leaking memory will eventually use it all up, no matter how large it is.
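To make the "tight loop" failure mode concrete, a contrived sketch: the first function pins a core at 100% and blocks the event loop, while the second waits without burning CPU:
// Anti-pattern: a synchronous busy-wait pegs one core and blocks the event loop
function sleepBusy(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* burn cycles */ }
}
// Fix: an asynchronous timer leaves the CPU idle while waiting
function sleepAsync(ms) {
  return new Promise(function (resolve) { setTimeout(resolve, ms); });
}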
You need to profile your application with real usage patterns on a system where the app works; only then will you know how many resources it needs. This is true for every kind of application.
Additionally, if you notice that resource usage grows over time, it may be a sign of a resource leak: memory leaking, spawned processes that never exit but keep using CPU and RAM, etc.
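For the profiling step, Node's built-in V8 profiler is a reasonable starting point (server.js is a placeholder for the app's entry point):
node --prof server.js
node --prof-process isolate-*.log
The first command writes an isolate-*-v8.log file while the app runs under real load; the second turns it into a readable summary of where the CPU time went.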
First of all, I would like to suggest that you follow these guidelines for a production environment.
1) Disable morgan (the request logger) if you enabled it for the dev environment.
2) Use nginx or pm2 for load balancing.
You can easily handle load balancing by using this command (see the note after this list):
pm2 start server.js -i 10
3) Handle uncaught exceptions, i.e.:
process.on("uncaughtException", function (err) {
  // do error handling
});
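A note on the pm2 command in item 2: -i sets the number of cluster-mode workers, and instead of hard-coding 10, pm2 also accepts max (or 0) to spawn one worker per available CPU core:
pm2 start server.js -i max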