cgroup limit reached - no space left on device - linux

We have two servers running Ubuntu 14.04 with Docker. Every other month, when starting or building a container, we get this message:
container_linux.go:247: starting container process caused "process_linux.go:258: applying cgroup configuration for process caused
\"mkdir /sys/fs/cgroup/memory/docker/cf657a58a1382e62976b4d339946f07e8a40f22f18b52822f884834f78830806: no space left on device\""
The disks still have lots of space, but cat /proc/cgroups gives this (num_cgroups keeps increasing):
#subsys_name hierarchy num_cgroups enabled
cpuset 1 65805 1
cpu 2 65807 1
cpuacct 3 65803 1
blkio 4 65803 1
memory 5 65535 1
devices 6 65805 1
freezer 7 65803 1
net_cls 8 65803 1
perf_event 9 65803 1
net_prio 10 65803 1
hugetlb 11 65803 1
Restarting the server has always helped so far, but we don't want to restart a server every few months.
So I started some research and found a suspicious directory under the /sys/fs/cgroup/*/user path.
/sys/fs/cgroup/systemd/user/998.user itself holds 65662 subdirectories, all named something like 36309.session (the number keeps increasing).
Is there a way to see which process is creating those cgroups?
I thought it was process 998, but that process doesn't even exist.
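As an aside, a rough way to see where the leak is concentrated (paths taken from the question above; getent and find are standard tools, and 998 is assumed to be a user ID rather than a process ID):
for d in /sys/fs/cgroup/*/user/998.user; do
  # count the *.session child cgroups under each controller
  echo "$d: $(find "$d" -maxdepth 1 -type d -name '*.session' | wc -l) session cgroups"
done
getent passwd 998   # resolve UID 998 to a user name (likely a service account)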

I ran into this same problem with AWS Batch. I don't have a solution, but I found this discussion: https://github.com/moby/moby/issues/29638. It seems the problem is some kind of leak in the kernel and/or Docker.

I encountered the same issue. You probably have a lot of dangling images/containers,
which is causing Docker's cgroup hierarchy to run out of entries. Check it with:
docker images -a
docker ps -a
You need to clean it up. One solution is to remove all images/containers/etc that are not being used at the moment:
docker system prune -a
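As a quick check before pruning, counting leftover containers and dangling images works with standard flags:
docker ps -aq | wc -l                     # all containers, running or exited
docker images -qf dangling=true | wc -l   # untagged (dangling) images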

Related

Ubuntu 16 gives "fork: retry: Resource temporarily unavailable", Ubuntu 20 doesn't

I have 2 similar machines in terms of HW. One has Ubuntu 16, the other Ubuntu 20.
I'm running a Python program that is meant to open 30K TCP connections to an endpoint. The Ubuntu 20 machine was able to do the job well just by running these 2 commands before executing the program:
#ulimit -n 1000000
#ulimit -u 1000000
However, the Ubuntu 16 machine, after creating 12K connections, gives this:
-su: fork: retry: No child processes
-su: fork: retry: Resource temporarily unavailable
-su: fork: retry: Resource temporarily unavailable
-su: fork: retry: Resource temporarily unavailable
Any idea what may be causing Ubuntu 16 to behave like that while Ubuntu 20 seems fine?
Note: I have tried a few things from different posts, but none of them worked.
Thanks in advance.
I think the maximum overall number of processes is lower on Ubuntu 16.04 than on 20.04.
For example, on Ubuntu 16.04 /proc/sys/kernel/pid_max is 32768, while on 18.04 it's 131072.
You might have enough other processes to hit this limit; at the very least it's worth checking pid_max to see.
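If you want to check and raise that limit, a rough sketch (131072 here is just an example value, not a recommendation):
cat /proc/sys/kernel/pid_max                # current system-wide limit
sudo sysctl -w kernel.pid_max=131072        # raise it for the running kernel
echo 'kernel.pid_max = 131072' | sudo tee /etc/sysctl.d/90-pid-max.conf   # persist (file name is arbitrary)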
Also it would be better to write your test program to make many connections from a single process/thread using async code, as that would allow higher limits.
PS: Ubuntu 16.04 is end-of-life (unless you're paying for the extension), so you might want to ensure you upgrade.
PS: Ubuntu version numbers (except for the minimal variants) need the second part (e.g. 16.04, not just 16) to be correct.
OK, the solution here was to stay away from Ubuntu 16.

GlusterFS on FreeBSD 11.1 / Mount issue

I want to use GlusterFS as distributed file storage on FreeBSD 11.1.
Documentation is poor, so I followed some howtos on the net.
I could create the GlusterFS volume, but I have trouble mounting it on another client machine. Here is what I did so far:
I have three hosts, all in the same subnet.
10.0.0.21 Webserver
10.0.0.31 gluster1
10.0.0.32 gluster2
I added the above entries in the /etc/hosts files on all of the three hosts.
I modified /etc/rc.conf on gluster1 and gluster2 with:
glusterd_enable="YES"
on gluster1 I did:
gluster peer probe gluster2
(succeeded)
gluster1 and gluster2 each have the following hard drive: /dev/da1
It is partitioned (BSD label) and mounted on gluster1 and gluster2 as /datastore.
cat /etc/fstab on both gluster1 and gluster2 gives:
# Device Mountpoint FStype Options Dump Pass#
/dev/da0a / ufs rw 1 1
/dev/da1a /datastore ufs rw 2 2
I created the gluster volume1:
gluster volume create volume1 replica 2 transport tcp gluster1:/datastore gluster2:/datastore force
(I'm aware of the split-brain risk; this is a simple test scenario.)
I started the volume1 with:
gluster volume start volume1
A check of the volume1 with:
gluster volume info
gives me back:
Type: Replicate
Volume ID: a760c545-1cc9-47a4-bc9e-51f6180e4d7a
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: gluster1:/datastore
Brick2: gluster2:/datastore
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
So far everything worked and seems to be fine.
Now my trouble starts when I try to mount and use this on the client/consumer machine (Webserver).
I read in several places that the GlusterFS volume1 should be mountable with:
mount -t glusterfs gluster1:/volume1 /mnt
This simply gives me back the following error:
mount: gluster1:/volume1: Operation not supported by device
As I normally do before I ask "silly" questions, I googled a lot for this.
I also played around with installing glusterfs on the client (pkg install glusterfs), enabling it in the client's /etc/rc.conf, and adding FUSE bits, but I could not get it to work.
I feel quite annoyed, because I know it must be a very small thing I'm missing here.
Can anyone shed some light on my issue?
A check with gluster volume status volume1 gives:
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick gluster1:/datastore N/A N/A N N/A
Brick gluster2:/datastore N/A N/A N N/A
Self-heal Daemon on localhost N/A N/A N 55181
Self-heal Daemon on gluster2 N/A N/A N 30318
Task Status of Volume volume1
------------------------------------------------------------------------------
There are no active volume tasks
So, I enabled NFS with this:
gluster volume set volume1 nfs.disable off
There was a warning to no longer use Gluster NFS and to use NFS-Ganesha instead. I ignored the warning for this test.
Now I restarted the volume:
gluster volume stop volume1
gluster volume start volume1
To check I did:
gluster volume info
which now showed:
Volume Name: volume1
Type: Replicate
Volume ID: a760c545-1cc9-47a4-bc9e-51f6180e4d7a
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: gluster1:/datastore
Brick2: gluster2:/datastore
Options Reconfigured:
nfs.disable: off
transport.address-family: inet
So nfs.disable was set to off. NFS should be on now, right?
But
gluster volume status volume1
still shows no NFS running:
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick gluster1:/datastore N/A N/A N N/A
Brick gluster2:/datastore N/A N/A N N/A
NFS Server on localhost N/A N/A N N/A
Self-heal Daemon on localhost N/A N/A N 99115
NFS Server on gluster2 N/A N/A N N/A
Self-heal Daemon on gluster2 N/A N/A N 37075
Task Status of Volume volume1
------------------------------------------------------------------------------
There are no active volume tasks
Also disturbing (besides NFS showing Online = N) is that both bricks seem to be offline too (Online indicated as N).
So I'm really stuck and could use some help.
Finally it is working:
/usr/local/sbin/mount_glusterfs gluster1:/volume1 /mnt
did the trick...
The client also needs to have the net/glusterfs package installed, and the following statement in /boot/loader.conf:
fuse_load="YES"
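Putting the client-side pieces together, a rough sketch for FreeBSD 11.1 (package and mount point as used above; kldload just avoids a reboot):
pkg install glusterfs                        # provides /usr/local/sbin/mount_glusterfs
echo 'fuse_load="YES"' >> /boot/loader.conf  # load the FUSE module at boot
kldload fuse                                 # load it right now instead of rebooting
/usr/local/sbin/mount_glusterfs gluster1:/volume1 /mnt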
Cheers
I think the issue may be with the UFS file system. Does it support extended attributes extensively?
GlusterFS requires a filesystem with extended attribute support (XFS is one).
From the link (https://access.redhat.com/articles/1273933):
As the Red Hat Storage makes extensive use of extended attributes, an XFS inode size of 512 bytes works better with Red Hat Storage than the default XFS inode size of 256 bytes. So, inode size for XFS must be set to 512 bytes, while formatting the Red Hat Storage bricks. To set the inode size, you need to use -i size option with the mkfs.xfs command.
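As a concrete illustration of that last sentence (this applies when formatting bricks on a Linux host; the device name below is only a placeholder):
mkfs.xfs -i size=512 /dev/sdb1   # format a brick with 512-byte inodes, as the article recommends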

Docker Devmapper space issue - increase size

I have the same issue as in "space issue on docker devmapper and CentOS7".
That question only covers cleaning up, not how I can increase the space, and I don't have any images to clean. I tried several things with dm.min_free_space, but nothing worked, and I want to increase the space.
OS Version/build: Red Hat Enterprise Linux Server release 7.3 (Maipo)
App version:
Client:
Version: 1.12.6
API version: 1.24
Package version: docker-common-1.12.6-11.el7.centos.x86_64
Go version: go1.7.4
Git commit: 96d83a5/1.12.6
Built: Tue Mar 7 09:23:34 2017
OS/Arch: linux/amd64
Server:
Version: 1.12.6
API version: 1.24
Package version: docker-common-1.12.6-11.el7.centos.x86_64
Go version: go1.7.4
Git commit: 96d83a5/1.12.6
Built: Tue Mar 7 09:23:34 2017
OS/Arch: linux/amd64
Steps to reproduce
I have no containers running currently and have some docker images pertaining to Kubernetes which will be used by the Kubernetes service.
sudo docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
[kubeuser4@kubenode4 Employee]$ sudo docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
docker.io/busybox latest 00f017a8c2a6 5 days ago 1.11 MB
registry.access.redhat.com/rhel7/pod-infrastructure latest 34d3450d733b 6 weeks ago 205 MB
docker.io/java 8 d23bdf5b1b1b 8 weeks ago 643.1 MB
gcr.io/google_containers/heapster_grafana v2.6.0-2 b43443930626 12 months ago 230 MB
When I try to build a Docker image for my application, I get the error below.
devmapper: Thin Pool has 8783 free data blocks which is less than minimum required 163840 free data blocks. Create more free space in thin pool or use dm.min_free_space option to change behavior
I tried cleaning up as mentioned in other forums, but it didn't help much and I'm still getting the same error. When I tried running sudo docker --storage-opt dm.min_free_space=0%, it seems to start as a daemon, but it then failed with another error ("docker-runc not installed on system"), and I don't want to run it as a daemon anyway.
Below are some command outputs
sudo dmsetup status
localvg00-lv_home: 0 20971520 linear
localvg00-lv_home: 20971520 20971520 linear
docker-251:5-134039-pool: 0 209715200 thin-pool 924 848/524288 1629226/1638400 - rw discard_passdown queue_if_no_space
localvg00-lv_tmp: 0 4194304 linear
localvg00-lv_swap: 0 8388608 linear
localvg00-lv_root: 0 2097152 linear
localvg00-lv_root: 2097152 20971520 linear
localvg00-lv_usr: 0 16777216 linear
localvg00-lv_var: 0 8388608 linear
localvg00-lv_var: 8388608 62914560 linear
sudo docker info
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 4
Server Version: 1.12.6
Storage Driver: devicemapper
Pool Name: docker-251:5-134039-pool
Pool Blocksize: 65.54 kB
Base Device Size: 10.74 GB
Backing Filesystem: xfs
Data file: /dev/loop0
Metadata file: /dev/loop1
Data Space Used: 106.8 GB
Data Space Total: 107.4 GB
Data Space Available: 601.2 MB
Metadata Space Used: 3.473 MB
Metadata Space Total: 2.147 GB
Metadata Space Available: 2.144 GB
Thin Pool Minimum Free Space: 10.74 GB
Udev Sync Supported: true
Deferred Removal Enabled: false
Deferred Deletion Enabled: false
Deferred Deleted Device Count: 0
Data loop file: /var/lib/docker/devicemapper/devicemapper/data
WARNING: Usage of loopback devices is strongly discouraged for production use. Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
Library Version: 1.02.135-RHEL7 (2016-11-16)
Logging Driver: journald
Cgroup Driver: systemd
Plugins:
Volume: local
Network: overlay null bridge host
Swarm: inactive
Runtimes: runc docker-runc
Default Runtime: docker-runc
Security Options: seccomp
Kernel Version: 4.1.12-61.1.28.el7uek.x86_64
Operating System: Oracle Linux Server 7.3
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 2
CPUs: 2
Total Memory: 7.545 GiB
Name: kubenode4
I had also tried increasing the physical volume size and the logical volume size (lv_var) on my Linux machine, but it still doesn't work.
sudo lvs
[sudo] password for kubeuser4:
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
lv_home localvg00 -wi-ao---- 20.00g
lv_root localvg00 -wi-ao---- 11.00g
lv_swap localvg00 -wi-ao---- 4.00g
lv_tmp localvg00 -wi-ao---- 2.00g
lv_usr localvg00 -wi-ao---- 8.00g
lv_var localvg00 -wi-ao---- 34.00g
sudo ls -lsh /var/lib/docker/devicemapper/devicemapper/data
2.3G -rw------- 1 root root 100G Mar 14 22:16 /var/lib/docker/devicemapper/devicemapper/data
Could someone please let me know how this can be done?
Thanks,
It is better to move away from devicemapper, for a few reasons:
devicemapper in loopback mode has an unrecoverable storage issue: https://github.com/docker/docker/issues/3182 ("devicemapper not recommended for production use").
I found it easy enough to switch to the overlay storage driver; YMMV, of course, but hopefully not by much. rm -rf /var/lib/docker is somewhat optional when switching, but it's easy and I would highly recommend it as long as you can load your images back in afterwards. http://www.projectatomic.io/blog/2015/06/notes-on-fedora-centos-and-docker-storage-drivers/
systemctl stop docker
rm -rf /var/lib/docker
# If these files do not already exist, create them; otherwise edit them by hand.
# (You can also just add "-s overlay" to the docker systemd unit instead.)
ls /etc/sysconfig/docker /etc/sysconfig/docker-storage
[[ $? != 0 ]] && {
    echo "OPTIONS='--selinux-enabled=false'" > /etc/sysconfig/docker
    echo "DOCKER_STORAGE_OPTIONS=-s overlay" > /etc/sysconfig/docker-storage
}
systemctl start docker
systemctl status docker
docker images
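On newer Docker releases the same switch is usually made in /etc/docker/daemon.json rather than the sysconfig files; a hedged sketch (overlay2 needs a suitable kernel and backing filesystem):
systemctl stop docker
cat > /etc/docker/daemon.json <<'EOF'
{ "storage-driver": "overlay2" }
EOF
systemctl start docker
docker info | grep -i 'storage driver'   # confirm the driver actually changed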
more reading:
https://docs.docker.com/engine/userguide/storagedriver/selectadriver/
https://integratedcode.us/2016/08/30/storage-drivers-in-docker-a-deep-dive/
I was able to get it working and have described the steps in:
https://forums.docker.com/t/devmapper-space-issue/29786/3

Node state=down with TORQUE v6.1.0 on a Workstation

I was installing Torque 6.1.0 on a Ubuntu 16.04 Workstation, but the
installation doesn't seem to recognize how many cores and threads the
machine has. The only node I set up showed a status of "state=down" and
any job would trigger an error saying "not enough of the right type
of nodes". In fact, the workstation has 56 threads or 28 physical cores
on 2 processors, and I only want to use 54 threads or 27 physical cores
for the shared computing jobs. I realized that this might be related to the cgroup or NUMA configuration introduced in Torque v6.0, which I am not sure I set up correctly during installation. I did have cgroups enabled, but I am not sure whether I also need to enable the NUMA-aware functionality. Below are some outputs of the current configuration. What should I do? Thanks.
$ pbsnodes
node1
state = down
power_state = Running
np = 54
ntype = cluster
mom_service_port = 15002
mom_manager_port = 15003
total_sockets = 0
total_numa_nodes = 0
total_cores = 0
total_threads = 0
dedicated_sockets = 0
dedicated_numa_nodes = 0
dedicated_cores = 0
dedicated_threads = 0
$ lssubsys -am
cpuset /sys/fs/cgroup/cpuset
cpu,cpuacct /sys/fs/cgroup/cpu,cpuacct
blkio /sys/fs/cgroup/blkio
memory /sys/fs/cgroup/memory
devices /sys/fs/cgroup/devices
freezer /sys/fs/cgroup/freezer
net_cls,net_prio /sys/fs/cgroup/net_cls,net_prio
perf_event /sys/fs/cgroup/perf_event
hugetlb /sys/fs/cgroup/hugetlb
pids /sys/fs/cgroup/pids
There is also a fishy part: it seems the server cannot see the node I already defined in the server's config file. This can be seen in the /var/spool/torque/server_logs log file:
12/27/2016 15:48:33.147;01;PBS_Server.2692;Svr;PBS_Server;LOG_ERROR::get_node_from_str, Node node1 is reporting on node NapaValley, which pbs_server doesn't know about
12/27/2016 15:49:18.232;01;PBS_Server.2692;Svr;PBS_Server;LOG_ERROR::get_node_from_str, Node node1 is reporting on node NapaValley, which pbs_server doesn't know about
12/27/2016 15:49:25.491;08;PBS_Server.2696;Job;0.NapaValley;Job deleted at request of cquic@localhost
12/27/2016 15:49:27.023;08;PBS_Server.2657;Job;0.NapaValley;on_job_exit valid pjob: 0.NapaValley (substate=59)
12/27/2016 15:49:32.996;256;PBS_Server.2657;Job;0.NapaValley;dequeuing from batch, state COMPLETE
12/27/2016 15:49:59.722;256;PBS_Server.2696;Job;1.NapaValley;enqueuing into batch, state 1 hop 1
12/27/2016 15:49:59.722;08;PBS_Server.2696;Job;perform_commit_work;job_id: 1.NapaValley
12/27/2016 15:49:59.722;02;PBS_Server.2696;node;close_conn;Closing connection 9 and calling its accompanying function on close
12/27/2016 15:49:59.795;64;PBS_Server.2692;Req;node_spec;job allocation request exceeds currently available cluster nodes, 1 requested, 0 available
12/27/2016 15:49:59.796;08;PBS_Server.2692;Job;1.NapaValley;Job Modified at request of root@localhost
12/27/2016 15:50:03.312;01;PBS_Server.2696;Svr;PBS_Server;LOG_ERROR::get_node_from_str, Node node1 is reporting on node NapaValley, which pbs_server doesn't know about
On my /etc/hosts, I have
127.0.0.1 localhost node1
127.0.0.1 NapaValley
PS: I have tried to mount cpu and other modules to /var/spool/torque/cgroup directories, but lssubsys -am still showed the same information as above. I assume they should have been mounted?
A node will report to the server with a name returned by the gethostbyname call. Based on the log lines you posted, the server and the node don't agree on that name. You can have pbs_mom return a different name by starting it with the -H option:
http://docs.adaptivecomputing.com/torque/6-0-2/adminGuide/help.htm#topics/torque/commands/pbs_mom.htm#-h
"-H hostname Sets the MOM's hostname. This can be useful on multi-homed networks."
This is equivalent to setting $mom_host node1 in /var/spool/torque/mom_priv/config.
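In practice, either of these should work (the hostname node1 is taken from the question; the paths are the default Torque layout):
pbs_mom -H node1                                              # start the MOM with an explicit hostname
# or make it permanent in the MOM config, then restart pbs_mom:
echo '$mom_host node1' >> /var/spool/torque/mom_priv/config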

How do I access a USB drive on an OSX host from inside a docker container?

I have an application that I eventually want to run on a cloud computing service (e.g., such as AWS or Google Cloud) packaged inside a docker image. The reason the application will need to run in the cloud is because it's designed to process large data files, but before I actually deploy, I'd like to test it first on a local laptop, using a single large data file that I've stored (for test and development purposes) on an external USB drive.
My development machine is an OSX laptop, and I'm using a recent version of docker:
stachyra> uname -a
Darwin Andrews-MacBook-Pro-76.local 14.5.0 Darwin Kernel Version 14.5.0: Tue Sep 1 21:23:09 PDT 2015; root:xnu-2782.50.1~1/RELEASE_X86_64 x86_64
stachyra> docker --version
Docker version 1.10.2, build c3959b1
OSX has mounted my external USB drive, device /dev/disk2s2, as /Volumes/MGR DATA:
stachyra> df
Filesystem 512-blocks Used Available Capacity iused ifree %iused Mounted on
/dev/disk1 974770480 435721376 538537104 45% 54529170 67317138 45% /
devfs 375 375 0 100% 650 0 100% /dev
map -hosts 0 0 0 100% 0 0 100% /net
map auto_home 0 0 0 100% 0 0 100% /home
/dev/disk2s2 3906291632 3869523640 36767992 100% 483690453 4595999 99% /Volumes/MGR DATA
/dev/disk3s1 196608 193160 3448 99% 24143 431 98% /Volumes/VirtualBox
stachyra> diskutil list
/dev/disk0
#: TYPE NAME SIZE IDENTIFIER
0: GUID_partition_scheme *500.3 GB disk0
1: EFI EFI 209.7 MB disk0s1
2: Apple_CoreStorage 499.4 GB disk0s2
3: Apple_Boot Recovery HD 650.0 MB disk0s3
/dev/disk1
#: TYPE NAME SIZE IDENTIFIER
0: Apple_HFS Macintosh HD *499.1 GB disk1
Logical Volume on disk0s2
DB70B91A-3B57-4C82-A758-C4BDEA4160FD
Unlocked Encrypted
/dev/disk2
#: TYPE NAME SIZE IDENTIFIER
0: GUID_partition_scheme *2.0 TB disk2
1: EFI EFI 209.7 MB disk2s1
2: Apple_HFS MGR DATA 2.0 TB disk2s2
/dev/disk3
#: TYPE NAME SIZE IDENTIFIER
0: GUID_partition_scheme *100.7 MB disk3
1: Apple_HFS VirtualBox 100.7 MB disk3s1
It should also be noted that the drive has several directories and data visible inside it, at least when viewed directly through OSX:
stachyra> ls -l /Volumes/MGR\ DATA
total 0
drwxr-xr-x 6 stachyra staff 204 Apr 14 2015 1000genomes
drwxr-xr-x 5 stachyra staff 170 Oct 12 17:41 GIAB
drwxr-xr-x 4 stachyra staff 136 Apr 28 2015 genome_browser_tracks
drwxr-xr-x 24 stachyra staff 816 Oct 6 14:00 mitty
I have tried to follow the advice from this question, which describes how to mount a USB drive in docker when docker is running within a linux host. But my local laptop is OSX, not linux, so it doesn't seem to work.
Explicitly, when attempting to follow the advice of the accepted answer, I obtain the following result:
stachyra> docker run -i -t --privileged -v /dev/disk2s2:/dev/foo ubuntu bash
root@8da7b492a707:/# uname -a
Linux 8da7b492a707 4.1.18-boot2docker #1 SMP Sat Feb 20 08:24:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
root@8da7b492a707:/# ls -l /dev/foo
total 0
root@8da7b492a707:/#
Based upon the response, one can see that docker does indeed launch a linux container correctly, and it also creates a volume /dev/foo inside the container as requested, but the actual contents of the USB drive are not accessible via that location: the ls -l command claims there are no files or directories there.
I also tried the second method described in an alternate response to the same question, and that fails even worse:
stachyra> docker run -i -t --device=/dev/disk2s2 ubuntu bash
docker: Error response from daemon: error gathering device information while adding custom device "/dev/disk2s2": not a device node.
stachyra>
I have found another discussion thread on stackoverflow which suggests that raw USB access is handled quite differently in OSX than in linux, which I suspect is probably the reason why both of the above attempts at USB access are failing.
But, what should I actually do about it? That is to say, what is the correct sequence of actions or commands to allow docker to access a USB device mounted on an OSX host, rather than linux?
I was finally able to access my USB drive from /var/media inside my container by using the machine-diskutil.sh script mentioned in warmoverflow's comment, like so:
machine-diskutil.sh mount my-machine-name /Volumes/my-usb-drive
and then starting the container like so:
docker run -v /Volumes/my-usb-drive:/var/media -it my/image:latest bash
Because I had tried to add /Volumes/my-usb-drive as a shared folder manually in VirtualBox, I first got this error:
Error: The shared folder /Volumes/Seagate already exists on the
docker machine, please unmount it first.
So I removed it manually and re-ran the machine-diskutil.sh mount command without any problems. Great stuff!
As per @pgayvallet's comment on GitHub:
As the daemon runs inside a VM in Docker Desktop, it is not possible to actually share a mac host device with the container inside the VM, and this will most definitely never be possible.
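File-level access is still possible, though: once OSX has mounted the drive, you can bind-mount the mount point instead of the raw device (a rough sketch; the volume name is taken from the question and the path must be within Docker Desktop's shared folders):
docker run --rm -it -v "/Volumes/MGR DATA:/data:ro" ubuntu ls -l /data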
