Are there any ways to run a process inside a Docker container without building an image, and without all the other isolation (I/O, etc.)?
My end goal is not to build an isolated environment, but rather to limit CPU and memory usage (the ability to malloc). Using VM instances is just too much overhead. The ulimit, systemd, cpulimit, and other Linux tools don't seem to provide a good solution here. For example, systemd seems to simply kill the process once RES/VIRT exceeds the threshold.
Docker seems to do the trick without performance degradation, but are there any simple methods to run e.g. a Python script without all the extra hassle and configuration?
Or are there any other ways to limit CPU and memory usage?
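For context, the kind of one-off invocation this is about might look like the following; the image, limits, and script path are just placeholders:

docker run --rm -m 256m --cpus 1.0 -v "$PWD":/app python:3 python /app/script.py

Here -m caps memory (the container is OOM-killed if it exceeds the cap), --cpus throttles CPU time, and --rm removes the container afterwards, so with a stock image nothing has to be built or kept around.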
Related
I need to run a Bash script on a Linux machine and need to limit its resource usage (RAM and CPU).
I am using cgroups, but they kill the process when it exceeds the limits. I don't want that; I just want the process to keep running with the maximum amount of memory and CPU I gave it, without being killed.
Any solution for that? Or is it possible to configure cgroups for this case?
Thank you
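For what it's worth, with cgroup v2 a CPU limit (cpu.max) only ever throttles, and memory.high applies reclaim pressure instead of invoking the OOM killer the way memory.max does. A rough sketch, assuming a cgroup v2 hierarchy mounted at /sys/fs/cgroup, root access, and that the cpu and memory controllers are enabled in cgroup.subtree_control ("limited" and myscript.sh are placeholder names):

sudo mkdir /sys/fs/cgroup/limited
echo "50000 100000" | sudo tee /sys/fs/cgroup/limited/cpu.max    # throttle to half a CPU (50 ms per 100 ms)
echo 512M | sudo tee /sys/fs/cgroup/limited/memory.high          # reclaim/throttle above 512 MB, no OOM kill
echo $$ | sudo tee /sys/fs/cgroup/limited/cgroup.procs           # move this shell into the group
./myscript.sh                                                    # children inherit the cgroup

Note that a process which keeps allocating faster than reclaim can free memory will still stall hard under memory.high; that is the trade-off for not being killed.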
I've been struggling to run multiple instances of Puppeteer on DigitalOcean for quite some time with little luck. I'm able to run ~5 concurrently using tools like puppeteer-cluster, but for some reason the whole thing just chokes with little helpful messaging. So, I switched to spawning ~5 child processes without any additional library -- just Puppeteer itself. Same issue. Chokes with no helpful errors.
I'm able to run all of these jobs just fine locally, but after I deploy, I hit these walls. So, my hunch is that it's a resource/performance issue, but I can't say for sure.
I'm running a droplet with 1 GB of RAM and 3 vCPUs on DigitalOcean.
Basically, I'm just looking for ways to start troubleshooting something like this. Is there a way I can know for sure that I'm hitting resource walls? I've tried pm2 and the DO dashboard graphs, but I feel like those are all leaving a lot of information out, or else I'm missing something else altogether.
Author of puppeteer-cluster here. You are right, 1 GB of memory is likely not enough for running 5 browser windows (or tabs) in addition to your operating system and maybe even other background tasks.
Here is a list of resources you should check:
Memory: Use a tool like htop to check your memory usage while your application is running.
CPU: Again, you can use htop for that; 3 vCPUs should be more than enough for 5 windows.
Disk space: Use a tool like df to check if there is enough space on the disk. I know of multiple cases in which there was not enough space on the disk (like some old kernels filling the disk), and Chrome needs at least some space to run.
Network throughput: Rarely the problem, but sometimes the network just does not have the bandwidth to support many open browsers. Use a tool like nload to check the network throughput.
To use htop or nload, you start your script in the background (node script.js &) or use a terminal multiplexer (like tmux). Resource problems should then be easy to spot.
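For example, a quick pass over those four checks might look like this, assuming the tools are installed:

node script.js &    # start the script in the background
htop                # watch memory and CPU per process
df -h               # check free disk space per filesystem
nload               # watch current network throughput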
Most probably you're running out of memory; 5 Puppeteer processes are a lot for a 1 GB VM.
You can run
grep -i 'killed process' /var/log/messages
to confirm that the OOM killer terminated your processes.
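On systems that don't write to /var/log/messages (e.g. Ubuntu, or anything logging only to the systemd journal), the equivalent checks are:

dmesg | grep -i 'killed process'
journalctl -k | grep -i 'killed process'    # -k limits output to kernel messages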
My backend is a Node.js application running on Ubuntu Linux. It needs to spawn a Node.js sub-process when there is a request from a client. The sub-process usually takes less than 20 seconds to finish. These processes need to be managed if many concurrent requests come in. I am thinking of moving the spawned process inside a Docker container; that is, a new Docker container would be created to run the process whenever a request comes in from a client. This way, I can use Kubernetes to manage these containers. I am not sure whether this is a good design, and whether putting the process inside a Docker container causes any performance issues.
The reason I am thinking of using Docker containers instead of spawn is that Kubernetes offers all the features needed to manage them: auto-scaling when there are too many requests, limiting the CPU and memory of each container, scheduling, monitoring, etc. I would have to implement this logic myself if I used spawn.
You can easily measure the overhead yourself: get any basic docker image (e.g. a Debian base image) and run
time bash -c true
time docker run debian bash -c true
(Run each a few times and ignore the first runs.)
This will give you the startup and cleanup costs. During actual runtime, there is negligible/no further overhead.
Kubernetes itself may add some more overhead; it's best to measure that too.
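A minimal way to do the repeated timing; --rm makes each measurement include container cleanup as well:

for i in 1 2 3 4 5; do time docker run --rm debian bash -c true; done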
From the Docker documentation on network settings:
Compared to the default bridge mode, the host mode gives significantly better networking performance since it uses the host’s native networking stack whereas the bridge has to go through one level of virtualization through the docker daemon. It is recommended to run containers in this mode when their networking performance is critical, for example, a production Load Balancer or a High Performance Web Server.
So, answers which say there is no significant performance difference are incorrect as the Docker docs themselves say there is.
This is just in the case of networking. There may or may not be impacts on accessing disk, memory, CPU, or other kernel resources. I'm not an expert on Docker, but there are other good answers to this question around, for example here, and blogs detailing Docker-specific performance issues.
Ultimately, it will depend on exactly what your application does as to how it is impacted. The best advice will always be that, if you're highly concerned about performance, you should set your own benchmarks and do your own testing in your environment. That doesn't answer your question because there is no generic answer. Importantly, though, "there's virtually no impact" does not appear to be correct.
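For completeness, the host mode from the quote above is opt-in per container (Linux only); for example, with a placeholder image:

docker run --rm --network host my-web-server    # my-web-server stands in for your actual image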
Docker is in fact just a wrapper around core functionality of Linux itself, so there is no significant impact; it merely separates a process into a container. So the question is more about the levels of virtualization on your host. If it is Linux inside Windows, or Docker on Windows, it can affect your app, because that kind of virtualization is heavyweight. Docker lets you separate dependencies with almost no impact on performance.
I have a very odd issue on one of my OpenVZ containers. The memory usage reported by top, htop, free, and the OpenVZ tools seems to be ~4 GB out of the allocated 10 GB.
When I list the processes by memory usage or use the ps_mem.py script, I only get ~800 MB of memory usage. Similarly, when I browse the process list in htop, I find myself unable to pinpoint the memory-hogging offender.
There is definitely a process leaking RAM in my container, but even when it hits critical levels and I stop everything in that container (except for ssh, init, and shells), I cannot reclaim the RAM. Only restarting the container helps; otherwise the OOM killer eventually starts kicking in inside the container.
I was under the assumption that a leaky process releases all its RAM when killed, and that you can observe its misbehavior via top or similar tools.
If anyone has ever experienced behavior like this, I would be grateful for any hints. The container is running icinga2 (which I suspect of leaking RAM), although most of the time the monitoring process sits idle, as it manages to execute all its scheduled checks in a more than timely manner, so I'd expect the RAM usage to drop at those times. It doesn't, though.
I had a similar issue in the past, and in the end it was solved by the hosting company where I had my OpenVZ container. I think the best approach would be to open a support ticket with your hosting provider, explain the problem, and ask them to investigate. Maybe they use an outdated kernel version, or they made changes on the server that have an impact on your OVZ container.
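Before (or while) opening that ticket, it may also be worth ruling out kernel-side memory that no process owns, such as page cache and slab. A quick sketch, assuming the container actually exposes these interfaces (OpenVZ sometimes restricts them):

free -m                                            # overall totals, including buffers/cache
grep -E 'Slab|SReclaimable|Cached' /proc/meminfo   # kernel slab and page-cache counters
slabtop -o | head -15                              # one-shot list of the biggest slab consumers (needs root)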
I have a C/C++ application that crashes only under heavy loads. I normally use valgrind and gprof to debug memory leaks and profiling issues. The failure rate is about 100 in a million runs, and it is consistent.
Rather than reproduce the traffic to my application, can I artificially limit the resources available to the debug build of the application running within valgrind somehow?
ulimit can be used from bash to set hard limits on some resources.
Note that in Linux only some of the memory ulimits actually work.
For example, I don't think ulimit -d, which is supposed to limit the data segment (which I think is RSS), really works.
As I recall from my experience with trying to keep Evolution (the email client) under control, ulimit -v (the virtual memory) was the only one that worked for me.
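A sketch of that approach; ulimit -v takes kilobytes, and the figure here is just an example. Note that the cap applies to the whole valgrind process, which adds significant virtual-memory overhead of its own on top of your application:

ulimit -v 1048576    # cap virtual memory at ~1 GB for this shell and its children
valgrind ./myapp     # ./myapp stands in for your debug build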
It sounds like it could be a race condition - have you tried the 'helgrind' valgrind tool?
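For reference, that is just a matter of the --tool flag:

valgrind --tool=helgrind ./myapp    # same placeholder binary as above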