I am running an Express.js application that serves as a REST API. One endpoint starts Puppeteer and tests my website with several procedures.
After starting the application and continuously hitting the endpoint, my Docker container runs out of memory every hour, as you can see below.
At first I thought I had a memory leak in Puppeteer / headless Chrome, but when I monitored the memory usage of the individual processes, no leak was visible, as you can see here:
0.00 Mb COMMAND
384.67 Mb /var/express/node_modules/puppeteer/.local
157.41 Mb node /var/express/bin/www
101.76 Mb node /usr/local/bin/pm2
4.34 Mb /var/express/node_modules/puppeteer/.local
1.06 Mb ps
0.65 Mb bash
0.65 Mb bash
0.31 Mb cut
0.31 Mb cut
0.13 Mb dumb
Now I have run out of ideas about what the problem could be. Does anyone have an idea where the RAM consumption could come from?
Analyse the problem further
You need to monitor the activity in real time.
We do not have the code, so we cannot know exactly what is going on. However, you can use more advanced tools like htop, gtop or netdata instead of top or ps.
The pm2 logs might also tell you more. In such situations, the logs will contain more data than the process manager shows. Make sure to investigate them thoroughly to see whether your scripts are responsible and whether they are throwing errors:
pm2 logs
Each API call will cost you
Calculate the cost early and prepare accordingly.
For every call, be prepared to use 100 MB to 1 GB of memory or more. It costs you just like an open browser tab, and the cost remains as long as the tab is open.
If the target website is heavy, it will cost more. Sites like YouTube will obviously cost you more.
Any script running inside the browser tab will also consume CPU and memory.
Say each process uses 300 MB of RAM. If you do not close the processes properly and keep making API calls, just 10 calls can easily use 3 GB of RAM. It adds up quickly.
Make sure to close the tabs
Whether the automation task succeeds or not, make sure to call browser.close() so that the resources it uses are freed. Most of the time we forget about such small things and it costs us.
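A minimal sketch of that pattern, assuming a hypothetical runTest() helper and placeholder procedure steps (the --no-sandbox flag is often needed inside Docker; adjust to your setup):

const puppeteer = require('puppeteer');

// Hypothetical helper: the actual test procedures are placeholders.
async function runTest(url) {
  const browser = await puppeteer.launch({ args: ['--no-sandbox'] });
  try {
    const page = await browser.newPage();
    await page.goto(url);
    // ... run your test procedures here ...
  } finally {
    // Always close, even when a procedure throws, so Chrome's memory is released.
    await browser.close();
  }
}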
Apply dumb-init on Docker to avoid ghost processes
Something like dumb-init or tini can be used if you have a process that spawns new processes and you do not have good signal handlers implemented to catch child signals and stop your children when your process should be stopped.
Read more on this SO answer.
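For reference, a hedged Dockerfile sketch of wiring dumb-init in as the entrypoint; the base image, paths and start command are assumptions, so adjust them to your actual setup:

FROM node:18-slim
# dumb-init is available as a Debian package
RUN apt-get update && apt-get install -y dumb-init && rm -rf /var/lib/apt/lists/*
WORKDIR /var/express
COPY . .
RUN npm ci
# dumb-init becomes PID 1, forwards signals and reaps zombie Chrome processes
ENTRYPOINT ["dumb-init", "--"]
CMD ["node", "bin/www"]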
I've got the problem solved. It was caused by the underlying Kubernetes setup, which had no resource limit configured for that specific container, so the container could consume as much memory as it wanted.
Now I've limited it to 2 GB and the memory usage stays within that bound.
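A limit like that can be set in the deployment manifest along these lines (a sketch; the name and image are placeholders):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: express-puppeteer
spec:
  replicas: 1
  selector:
    matchLabels:
      app: express-puppeteer
  template:
    metadata:
      labels:
        app: express-puppeteer
    spec:
      containers:
        - name: express-puppeteer
          image: registry.example.com/express-puppeteer:latest
          resources:
            requests:
              memory: "1Gi"
            limits:
              memory: "2Gi"

With the limit in place, the container gets OOM-killed and restarted when it exceeds 2 Gi instead of starving the whole node.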
Related
I've been struggling to run multiple instances of Puppeteer on DigitalOcean for quite some time with little luck. I'm able to run ~5 concurrently using tools like puppeteer-cluster, but for some reason the whole thing just chokes with little helpful messaging. So, I switched to spawning ~5 child processes without any additional library -- just Puppeteer itself. Same issue. Chokes with no helpful errors.
I'm able to run all of these jobs just fine locally, but after I deploy, I hit these walls. So, my hunch is that it's a resource/performance issue, but I can't say for sure.
I'm running a droplet with 1 GB of memory and 3 vCPUs on DigitalOcean.
Basically, I'm just looking for ways to start troubleshooting something like this. Is there a way I can know for sure that I'm hitting resource walls? I've tried pm2 and the DO dashboard graphs, but I feel like those are leaving a lot of information out, or else I'm missing something else altogether.
Author of puppeteer-cluster here. You are right, 1 GB of memory is likely not enough for running 5 browser windows (or tabs) in addition to your operating system and maybe even other background tasks.
Here is a list of resources you should check:
Memory: Use a tool like htop to check your memory usage while your application is running.
CPU: Again, you can use htop for that; 3 vCPUs should be more than enough for 5 windows.
Disk space: Use a tool like df to check if there is enough space on the disk. I know of multiple cases in which there was not enough space on the disk (like some old kernels filling the disk), and Chrome needs at least some space to run.
Network throughput: Rarely the problem, but sometimes the network just does not have the bandwidth to support many open browsers. Use a tool like nload to check the network throughput.
To use htop or nload, you start your script in the background (node script.js &) or use a terminal multiplexer (like tmux). Resource problems should then be easy to spot.
Most probably you're running out of memory; 5 Puppeteer processes are a lot for a 1 GB VM.
You can run
grep -i 'killed process' /var/log/messages
to confirm that the OOM killer terminated your processes.
I have set up an Azure WebApp (Linux) to run a WordPress site and another handmade PHP app on it. All works fine, but I get this weird CPU usage graph (see below).
Both apps are PHP 7.0 containers.
SSHing into the two containers and using top, I see no unusual CPU-hogging processes.
When I reset both apps, the CPU goes back to normal and then starts to rise slowly, as shown below.
The amount of HTTP requests to the apps has no relation to the CPU usage at all.
I tried to use apache2ctl to see if there are any pending requests, but that does not seem possible inside a Docker container.
Anybody got an idea how to track down the cause of this?
This is the top output. The instance has 2 cores. Lots of idle time but still over 100% load and none of the processes use the CPU ...
After dealing with MS Support on that issue, it seems to have boiled down to the WordPress theme being too slow or inefficient. Each request took very long and hogged CPU resources. All following requests started queuing up, which increased the CPU load further.
Why that would not show up as %CPU in top was not explained to me.
They proposed using a different theme or scaling up to a multi-core instance.
I am unsatisfied with that solution and will monitor further and try to find the real culprit.
I had almost exactly the same CPU Percentage graph as you did, although a Node.JS app instead of PHP. Disabling Diagnostic Logs > Docker Container Logging seems to have solved the problem for me.
I do not need those logs because I am logging to application insights.
But in your case you might need those logs. I have no solution for that, but I am guessing that more aggressive log rotation or reducing the size of the logs by other means might help.
Whenever we make an API call to our script it completes successfully, but after the script ends, the memory does not get released. Let's say 10 MB of memory was used during execution; after execution, the memory usage should have come down by at least 5 MB, but that is not happening.
So after a certain amount of time the memory usage goes beyond 75% and we start getting alerts.
Docker version 1.11.2, build b9f10c9/1.11.2
Python 3.4.2
Flask
We monitor usage with the docker stats command.
Found this solution and it really helped.
This issue was due to Linux and Python. Python was releasing the memory internally, but because the Flask process (the caller) was still running, Linux did not return that memory to the operating system, and because of this the memory usage never went down.
http://www.paulsprogrammingnotes.com/2014/10/large-dictionaries-not-released-from.html?showComment=1483516233443#c3352147816385844344
Background
I have a relatively simple node js application (essentially just expressjs + mongoose). It is currently running in production on an Ubuntu Server and serves about 20,000 page views per day.
Initially the application was running on a machine with 512 MB memory. Upon noticing that the server would essentially crash every so often I suspected that the application might be running out of memory, which was the case.
I have since moved the application to a server with 1 GB of memory. I have been monitoring the application and within a few minutes the application tends to reach about 200-250 MB of memory usage. Over longer periods of time (say 10+ hours) it seems that the amount keeps growing very slowly (I'm still investigating that).
I have since been trying to figure out what is consuming the memory. I have been going through my code and have not found any obvious memory leaks (for example unclosed db connections and such).
Tests
I have implemented a handy heapdump function using node-heapdump and I have now enabled --expose-gc to be able to manually trigger garbage collection. From time to time I try triggering a manual GC to see what happens with the memory usage, but it seems to have no effect whatsoever.
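For reference, a minimal sketch of that kind of setup, assuming an existing Express app, the node-heapdump package, and that the process is started with node --expose-gc; the route path and file location are placeholders:

const heapdump = require('heapdump');

// Hypothetical debug endpoint: force a GC (if available) and write a heap snapshot.
app.get('/debug/heapdump', (req, res) => {
  if (global.gc) global.gc(); // only defined when started with --expose-gc
  const file = `/tmp/${Date.now()}.heapsnapshot`;
  heapdump.writeSnapshot(file, (err) => {
    if (err) return res.status(500).send(err.message);
    res.send(`Heap snapshot written to ${file}`);
  });
});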
I have also tried analysing heapdumps from time to time - but I'm not sure if what I'm seeing is normal or not. I do find it slightly suspicious that there is one entry with 93% of the retained size - but it just points to "builtins" (not really sure what that signifies).
Upon inspecting the 2nd highest retained size (Buffer) I can see that it links back to the same "builtins" via a setTimeout function in some Native Code. I suspect it is cache or https related (_cache, slabBuffer, tls).
Questions
Does this look normal for a Node JS application?
Is anyone able to draw any sort of conclusion from this?
What exactly is "builtins" (does it refer to builtin js types)?
I am new to the PM2 concept. I am facing a problem where CPU usage increases, memory reaches up to 100%, and my server goes down, crashing the website. Can anyone please advise me on this? Do I need to change the configuration of my production (live) server, such as increasing the memory? My code is also necessary and sufficient. I am an EC2 user.
The system requirements will mostly depend on your application, which you have told us nothing about. If the CPU reaches 100%, then you likely have some tight loop that burns cycles synchronously or something like that. 100% memory usage can mean a memory leak, and in that case no amount of RAM will be sufficient, because leaking memory will eventually use up all of your RAM, no matter how large it is.
You need to profile your application with real usage patterns on a system where that app works and only then you will know how much resources it needs. This is true for every kind of application.
Additionally, if you notice that resource usage grows over time, it may be a sign of some resource leaking, like memory leaks, or spawned processes that never exit but keep using CPU and RAM, etc.
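As a starting point for that kind of profiling, a minimal sketch that logs the process memory once a minute so you can see whether it keeps growing under real traffic (the interval and output format are just placeholders):

// Log resident set size and heap usage periodically to spot a growing trend.
setInterval(() => {
  const { rss, heapUsed } = process.memoryUsage();
  console.log(`rss=${(rss / 1048576).toFixed(1)} MB heapUsed=${(heapUsed / 1048576).toFixed(1)} MB`);
}, 60 * 1000);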
First of all, I would like to suggest that you follow these guidelines for a production environment.
1) Disable morgan if you enabled it for the dev environment.
2) Use nginx or pm2 for load balancing,
or you can easily handle load balancing with this command (a fuller ecosystem-file sketch follows after this list):
pm2 start server.js -i 10
3) Handle uncaught exceptions, i.e.:
process.on("uncaughtException", function (err) {
  // do error handling
});
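As mentioned under point 2, here is an ecosystem-file sketch for pm2; the app name, script and memory threshold are assumptions, so tune them to your server:

// ecosystem.config.js
module.exports = {
  apps: [{
    name: 'server',
    script: 'server.js',
    instances: 'max',            // one worker per CPU core (cluster mode)
    exec_mode: 'cluster',
    max_memory_restart: '500M'   // restart a worker if it exceeds this memory
  }]
};

Start it with pm2 start ecosystem.config.js; the memory threshold at least keeps a leaking worker from taking the whole server down.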