Docker memory usage and server stuck - Linux

We are running into an issue on an Ubuntu EC2 instance: we run 8 Docker containers on an Amazon EC2 instance with 30 GB of memory and 16 cores. The instance gets stuck 3-4 days after a reboot, and the logs show memory errors. The docker stats command shows every container using less than 2 GB (about 16 GB total), but when we run free -g on the host it shows around 16-17 GB used, the rest of the memory in buff/cache, and 0 free.
Please let us know whether this is a problem or whether we need to change any configuration. I tried dropping the cache, but it filled up again within 10 minutes.
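For reference, here are the commands involved, including the cache drop (a minimal sketch; note that buff/cache is reclaimable memory, so a near-zero free value by itself is normal on Linux):
# Per-container and host-level memory usage, as described above:
docker stats --no-stream
free -g
# Drop the page cache plus dentries and inodes (requires root); this is
# what was tried above, and the kernel reclaims cache automatically anyway:
sync && echo 3 > /proc/sys/vm/drop_caches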
Versions:
Ubuntu: 16.04 (Xenial)
Docker: 17.09.0-ce
Please let me know if more details are required for troubleshooting.
Thanks,

Related

How can I increase the CPUs and Memory available to Docker in Linux?

When I run docker info in my Amazon Linux AMI EC2 instance I get:
Operating System: Amazon Linux AMI 2017.09
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 993.4MiB
How can I increase the number of CPUs and memory available to Docker? With only 1 CPU, my Docker container exits immediately with an exit code of 137 (which I understand means out of memory).
I know how to change these settings on my Mac Docker instance, but not on the Linux instance.
Thanks!
It turns out my problem was that the Amazon t2.micro instance I was using simply didn't have enough CPU/memory available.
Upgrading to a t2.medium, which has 2 CPUs and 4 GB of memory, fixed it.
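Worth noting: on Linux, unlike Docker Desktop on a Mac, there is no VM layer, so containers already see all of the host's CPUs and memory; the fix for a too-small host is a bigger instance. To confirm that a container exiting with 137 was actually OOM-killed, you can check its state (container_name is a placeholder):
# Prints true if the kernel OOM killer terminated the container:
docker inspect --format '{{.State.OOMKilled}}' container_name
# Exit code for comparison (137 = 128 + SIGKILL):
docker inspect --format '{{.State.ExitCode}}' container_name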

[WHM][CPANEL] xml-api - get_update_availability locks server

At random moments our cPanel server completely hangs. We see huge load spikes (around 150), heavy disk I/O, and 100% I/O wait.
I finally identified the responsible processes; it seems that cPanel/WHM executes a lot of these:
xml-api - get_update_availability -json ./get_update_availability
xml-api - get_current_lts_expiration_status -json ./get_current_lts_expiration_status
Killing them one by one unlocks the server and it starts running again.
Does anybody have an idea what this is and what causes so many of these processes to spawn?
CENTOS 7.3 x86_64 xen hvm
cPanel & WHM 64.0 (build 18)
also running csf/lfd
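Until the root cause is found, the manual "killing them one by one" step can be scripted; a minimal sketch using standard procps tools, with match patterns taken from the process names above:
# List the offending xml-api processes with their full command lines:
pgrep -af get_update_availability
# Kill them all at once instead of one by one:
pkill -f get_update_availability
pkill -f get_current_lts_expiration_status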

Docker build slow on EC2 (Amazon Linux)

I provisioned an instance from an Amazon Machine Image based on Amazon Linux (amzn-ami-2016.03.c-amazon-ecs-optimized). While attempting to run docker build for my project, I find the process extremely slow, even for simple steps like setting environment variables (ENV TEST_PORT=3000). A build that takes less than 5 minutes on my local machine has been running for at least an hour.
Running docker info shows the storage driver is devicemapper, and this article suggests switching to aufs, but it is for Ubuntu. I also have an EBS volume attached to my instance; how do I switch Docker to use that instead? Will that fix this problem?
I experienced the same problem: each simple step of the Dockerfile (like ENV or ARG) was taking one or two seconds on my Amazon Linux EC2 box.
To solve this, I had to:
upgrade Docker to version 17.03.2-ce
switch Docker's storage driver to overlay2, as suggested by https://docs.docker.com/engine/userguide/storagedriver/overlayfs-driver/ (there is a dedicated section for CentOS)
I created /etc/docker/daemon.json with the following content:
{
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ]
}
stop and start the Docker daemon
Now each simple step is very fast.
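For completeness, a sketch of the restart-and-verify step (the service command assumes a sysvinit-style Amazon Linux box; on systemd distributions use systemctl restart docker instead):
# Restart the daemon so it picks up /etc/docker/daemon.json:
sudo service docker restart
# Confirm that the new driver is active:
docker info | grep -i 'storage driver'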

ENOSPC error when deploying Meteor 1.3 app to Bluemix

Meteor 1.3 was released a few days ago.
Deploying Meteor 1.3 apps to IBM Bluemix with cf push may fail with an ENOSPC error. It appears that only simple Meteor 1.3 apps deploy successfully: deployment works only after removing many packages or files (or when creating a new app without adding many packages).
ENOSPC means either that no disk space is left or that the number of watched files has reached the maximum allowed.
I think the latter may be the case here. A solution for Node.js apps is to increase the limit, as described in Node.js Error: ENOSPC:
echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf && sudo sysctl -p
However, an attempt to use sudo in Bluemix returns
sudo: no tty present and no askpass program specified
A GitHub issue was created in the cf-meteor-buildpack repository that shows a number of other people experiencing the same problem:
https://github.com/cloudfoundry-community/cf-meteor-buildpack/issues/20
How can Meteor 1.3 apps be successfully deployed to Bluemix?
Update:
I have found the point at which adding just one more package to the app and deploying it causes the ENOSPC error. Without that package, the app deploys and runs successfully, and the total disk usage shown in the Bluemix Dashboard is just 220.1 MB. The default disk quota is 1 GB, so there is actually plenty of free space left. Therefore it is likely that the error is caused by exceeding the maximum number of watched files, not by the disk quota being reached.
Just to be sure, I increased the disk quota to 2 GB via Cloud Foundry's manifest.yml and the ENOSPC error still occurred when I added the extra package. Without the package, the disk usage is "220.1 MB / 2 GB" in the dashboard. That package (kadira:flow-router) is very unlikely to need 1.8 GB of space.
Meteor 1.3 creates many more files than previous versions because there is a new node_modules directory into which many individual Node.js modules are installed.
To check the value of max_user_watches, I added this line to the buildpack script:
cat /proc/sys/fs/inotify/max_user_watches
The result shown is 8192.
I then added this line to attempt to change the value:
echo 32768 > /proc/sys/fs/inotify/max_user_watches
The result is /proc/sys/fs/inotify/max_user_watches: Permission denied.
Is it possible to increase the maximum number of allowed file watches via Cloud Foundry? The operating system is Linux.
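To see how plausible the file-watch explanation is, one can count the files the build produces; a rough sketch, run from the app's build directory (Meteor 1.3's node_modules tree alone can easily exceed the 8192-watch default observed above):
# Count files in the deployed app tree and compare against the limit:
find . -type f | wc -l
cat /proc/sys/fs/inotify/max_user_watches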

Docker + Cassandra ulimit error

I am trying to start a Cassandra (not DSC) server in Docker (Ubuntu 14.04). When I run service cassandra start (as root), I get
/etc/init.d/cassandra: 82: ulimit: error setting limit (Operation not permitted)
line 82 of that file is
ulimit -l unlimited
I'm not really sure what I need to change it to.
I would expect you to get that warning but for Cassandra to continue to start up and run correctly. As pointed out in the other answer, Docker restricts certain operations for safety reasons. In this case, the Cassandra init script is trying to allow unlimited locked memory. Assuming you are running with swap disabled (a Cassandra best practice), you can safely ignore this error.
I run Cassandra in Docker for my development environment and also get this warning, but Cassandra starts and runs just fine. If it is not starting up, check the Cassandra log files for another problem.
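A quick way to check those logs; the path assumes the stock Debian/Ubuntu Cassandra package, and container_name is a placeholder:
# Inside the container:
tail -n 100 /var/log/cassandra/system.log
# Or from the host, if Cassandra is the container's main process:
docker logs container_name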
A short intro into ulimit: RESOURCE LIMITS ON UNIX SYSTEMS (ULIMIT).
The command this init script is trying to issue is supposed to set the max locked memory limit to, well, unlimited. Should succeed for root. Does whoami print root?
UPD: further research led me to this Google Groups discussion. Hopefully it will clarify things a bit.
/etc/init.d/cassandra start/restart/status will not work because no init system is running inside the container, so the available option is to restart the container:
docker restart "container id or container name"
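If you actually need mlock to work inside the container, another option is to grant the limit when the container is started rather than from the init script; a sketch assuming Docker 1.6+ and a hypothetical image name my-cassandra-image:
# Raise the locked-memory limit and grant the capability at run time:
docker run -d --ulimit memlock=-1:-1 --cap-add=IPC_LOCK my-cassandra-image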
