Docker run error: "Thin Pool has free data blocks which is less than minimum required" - linux

We are trying to run a Docker container in a way that used to work, but now we get a "Thin Pool" lack-of-space error:
docker run --privileged -d --net=host --name=fat-redis -v /fat/deploy:/fat/deploy -v /fat/fat-redis/var/log:/var/log -v /home:/home fat-local.indy.xiolab.myserv.com/fat-redis:latest /fat/deploy/docker/fat-redis/fat_start_docker_inner.sh
docker: Error response from daemon: devmapper: Thin Pool has 486 free data blocks which is less than minimum required 163840 free data blocks. Create more free space in thin pool or use dm.min_free_space option to change behavior.
See 'docker run --help'.
What does this error mean?
We tried 'docker rmi' and the advice from here, but all in vain.
Any ideas?
Thank you

Running with data/metadata over loopback devices was the default on older versions of Docker. There are problems with this, and newer versions have changed the default. If Docker was configured this way, normal package updates (e.g. through rpm/apt) don't change the configuration, which is why a full reinstall was required to fix it.
Here's an article with instructions on how to configure older versions to not use loopback devices:
http://www.projectatomic.io/blog/2015/06/notes-on-fedora-centos-and-docker-storage-drivers/
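On newer Docker releases (17.06+) you can also point the devicemapper driver at a real block device directly through /etc/docker/daemon.json instead of using docker-storage-setup. A rough sketch, assuming a spare block device at /dev/xvdf (the device path is an assumption; the option names come from Docker's device-mapper storage driver documentation):
{
  "storage-driver": "devicemapper",
  "storage-opts": [
    "dm.directlvm_device=/dev/xvdf",
    "dm.thinp_percent=95",
    "dm.thinp_metapercent=1",
    "dm.thinp_autoextend_threshold=80",
    "dm.thinp_autoextend_percent=20"
  ]
}
Docker has to be stopped and /var/lib/docker cleared before switching storage configuration, and existing images must be re-pulled afterwards.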

You don't have to reinstall Docker. Instead, you can clean up all the containers, images, volumes, etc. under the /var/lib/docker directory.
The images can then be pulled from your Docker repositories again. (This assumes you only use this Docker host for building Docker images.)
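A minimal sketch of that cleanup, assuming systemd (use service docker stop/start on older init systems). Be aware this deletes every image, container and volume on the host:
sudo systemctl stop docker
sudo rm -rf /var/lib/docker      # removes all images, containers, volumes and the devicemapper data files
sudo systemctl start docker      # Docker recreates /var/lib/docker with a fresh, empty thin pool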

My issue was unrelated to the loopback device problem, but it produced the same error. "docker images -a" showed a number of name=none, tag=none images taking up space. These images were not "dangling"; they were referenced by a currently active image and could not be deleted.
My solution was to run "docker save" to write the active image to a tar file, delete the active image (which also deleted all the child images), and then run "docker load -i" on the tar file to create a single new image. No more errors related to Thin Pool space.
Reinstalling Docker would also have fixed it, simply because a reinstall clears out all images, but the clutter would have started building up again and I would have hit the same issue in the future.
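For reference, the save/delete/load cycle described above looks roughly like this; myimage:latest is just a placeholder for the affected image:
docker save -o myimage.tar myimage:latest   # write the active image and its layers to a tar file
docker rmi myimage:latest                   # deleting it also removes the <none>:<none> parents it kept alive
docker load -i myimage.tar                  # re-import it as a single tagged image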

Use the following to clean up unnecessary images:
docker image prune -a --force --filter "until=240h"
Refer to this document for more details: https://docs.docker.com/engine/reference/commandline/image_prune/

TL;DR
Sometimes you just need more space. Increase the size of the data file with the truncate command.
Explanation:
The reason that a reinstall or a purge of all your images works is that, in loop-lvm mode, Docker keeps its image and container data in a sparse loopback file that acts as scratch space, and that space is not reclaimed automatically. If you run several different images, you can fill up this scratch space, and the newest image no longer has enough free blocks in the thin pool to run.
The docker system prune command won't help here because that space is legitimately consumed. You need to increase the size of the scratch file.
Make sure you have extra physical space on disk
df
Figure out the size of your data file
docker info |grep 'Data Space'
Find the location of your data file
docker info |grep 'loop file'
Increase the size of your data file (+50G or whatever)
sudo truncate -s 150G /var/lib/docker/devicemapper/devicemapper/data
Restart the machine. The guide talks about a bunch of commands to "cascade" the resize through the layers, but a restart did them automatically.
sudo reboot
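After the reboot, a quick way to check that the pool actually picked up the new size (assuming the devicemapper driver is still in use):
docker info | grep 'Data Space'     # "Data Space Total" should now reflect the larger file
sudo dmsetup status | grep pool     # the thin-pool line reports used/total data blocks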
References:
{all the SO posts that complained about the loopback driver being outdated}
https://docs.docker.com/storage/storagedriver/device-mapper-driver/#use-operating-system-utilities

It turned out that reinstalling Docker did the trick.
Use the following link: https://docs.docker.com/engine/installation/linux/centos/
Cheers

Related

How can I delete partially built docker images?

I have three parallel Jenkins builds. If one of them (the Linux build) fails, the other builds (two Windows builds that take much longer and run on a single Windows Jenkins node) are interrupted.
If they are interrupted after their docker build -t mytag:myname ... has finished executing, everything is fine: the images are deleted in Jenkins' post/always section by docker rmi mytag:myname.
However, if the Windows builds have not finished, their images are left unnamed/untagged and remain undeleted after the failed Jenkins job, and I am afraid of eating up all the storage on the Windows Jenkins node.
I must also mention that I cannot run parallel docker prune commands in Jenkins on the Windows node, because they end up being executed at the same time on the single Windows Jenkins node and I get the error: Error response from daemon: a prune operation is already running.
The only idea I have is a weekly cron job on the Linux/Windows Jenkins nodes that runs docker system prune -a every Sunday.
I would really appreciate any ideas on how these partially built images could be eliminated on the Windows node.
To remove untagged/dangling images you can try the command:
docker rmi $(docker image ls -q -f dangling=true)
A quick explanation of what this command does:
docker rmi - remove images with these IDs.
$(...) - execute a subcommand.
docker image ls - list all images.
-q - only show the IDs of the images.
-f dangling=true - -f is a filter, and we filter for dangling/untagged images.
Taken together, the subcommand gives you the IDs of all unused, untagged images, and the outer command removes the images with those IDs.
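Since Docker 1.13 the same cleanup is also available as a single command, which may be more convenient than the subshell:
docker image prune -f      # removes dangling images; add -a to also remove images not referenced by any container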
Sources:
https://docs.docker.com/engine/reference/commandline/image_ls/
https://docs.docker.com/engine/reference/commandline/rmi/
https://levelup.gitconnected.com/docker-remove-dangling-aka-none-images-8593ef60deda

Segmentation fault in Docker build command with NPM install

I'm experiencing a segmentation fault in the docker build command on an EC2 instance running Ubuntu. When it reaches the RUN command with npm install, it gives a segmentation fault. I also looked at storage space, since I read this about the segmentation signal: "A process that tries to read or write memory it's not allowed to access. The kernel will normally terminate the process."
My Dockerfile looks like this:
FROM node:12.18.1
WORKDIR /deployment-001/service
COPY package.json .
RUN npm install
COPY . .
EXPOSE 1883
CMD ["node","service.js"]
I'm attaching a screenshot of the storage space, which clearly shows there is plenty of free space to write to.
Please correct me if I'm wrong anywhere.
I also made a screenshot of the docker build command that gives me the segmentation fault.
Please help me if I'm wrong about Linux storage space or if I'm missing something about the segmentation fault.
NOTE: At first I had only 8 GB of storage space; I then expanded the EBS volume.
Thanks
A t2.micro has only 1 GB of RAM, which is not enough for an npm build. Change to a t2.small. This is a common issue for people who use Node.js but want to save cost with a t2.micro. Spend a little more so that you won't face this kind of issue in the future.
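If you want to confirm that RAM (rather than disk) is the bottleneck before resizing the instance, something like this on the host while the build runs should show it (plain Linux tools, nothing Docker-specific):
free -h                                               # total and available memory
sudo dmesg | grep -iE 'out of memory|killed process'  # the kernel logs it when it kills the npm process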
I don't know why it works, but the problem was solved by adding the --no-cache option to the docker build command; the segmentation fault no longer appeared.
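For completeness, that just means rebuilding without reusing any cached layers, roughly like this (the image tag here is only an example):
docker build --no-cache -t my-service .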

Trouble converting Docker to Singularity: "Function not implemented" in Singularity, but works fine in Docker

I have an Ubuntu docker container that works perfectly fine as is. I have a custom binary inside that executes and returns as expected. Because of security reasons, I cannot use docker for automated testing. I created a docker archive and then I load a singularity container from this docker archive. The binary that I need to run fails with the following error:
MyBinary::BinaryNameSpace::BinaryFunction[FATAL]: boost::filesystem::status: Function not implemented: "/var/tmp/username"
When I run ldd <binary_path>, I see that the Boost filesystem library is linked. I am not sure why the binary is unable to find the status function...
So far, I have used a tool called ermine to turn the dynamically linked binary into a static one, but I still got the same error, which I found very strange.
Any suggestions on directions to look next are very appreciated. Thank you.
Both /var/tmp and /tmp are silently automounted by default. If anything was added to /var/tmp during singularity build or in the source docker image, it will be hidden when the host's /var/tmp is mounted over it.
You can disable the automounts individually when you run a singularity command, which is probably what you want to do first to check that it is the source of the problem (e.g., singularity run --no-mount tmp ...). I'd also recommend using --writable-tmpfs or manually mounting -B /tmp to make sure that there is somewhere writable for any temp files. You are likely to get an error about a read-only filesystem if not.
The host OS environment can also cause problems in unexpected ways that are hard to debug. I recommend using --cleanenv as a general practice to minimize this.
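Putting those suggestions together, the test run might look roughly like this (my_container.sif is a placeholder; flag names are from Singularity/Apptainer 3.x, so check your version's --help):
# keep the image's own /tmp and /var/tmp, add a writable tmpfs overlay, and start from a clean environment
singularity run --no-mount tmp --writable-tmpfs --cleanenv my_container.sif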
The culprit was an outdated Linux kernel. The containers still use the host's kernel.
On Docker I was using kernel 5.4.x, while the computer that runs the Singularity container runs 3.10.x.
There are instructions in the binary that are not supported on 3.10.x.
There is no fix for now except running the automated tests on a different computer with a newer kernel.
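A quick way to spot this class of problem, since containers always report the host's kernel (my_container.sif is again a placeholder):
uname -r                                      # compare the kernel on the build machine and on the test machine
singularity exec my_container.sif uname -r    # prints the same version as the host it runs on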

Yarn is slow and freezes when run via docker exec

I recently started using docker (desktop version for Windows) for my node project development. I have a docker-compose file with volume configuration to share the project source files between my host machine and docker container.
When I need to install a new node module, I can't do that on my host machine, of course, because it's Windows and Docker is Linux or something, so I run docker exec -it my-service bash to "get into" the Docker container and then run yarn add something from inside it. The problem is that yarn runs extremely slowly and freezes almost all of the time. The Docker container then becomes unresponsive: I cannot cancel the yarn command or stop the container using docker-compose stop. The only way I've found to recover is to restart the whole Docker engine. Then, to finally install the new module, after the Docker engine restarts, I delete the node_modules folder and do the same steps again. This time it's still extremely slow, but somehow it doesn't freeze and actually installs the new module. But after some time, when I need to do that again, it freezes again and I have to delete node_modules again...
I would like to find the reasons why the yarn command is so slow and why it freezes.
I'm new to docker, so maybe my workflow is not correct.
I tried increasing RAM limit for docker engine from 2 GB to 8 GB and CPUs limit from 1 to 8, but it had absolutely no effect on the yarn command behavior.
My project was using file watching with chokidar, so I thought maybe that could cause the problem, but disabling it had no effect either.
I also thought the problem could be the file-sharing mechanism between the host machine (Windows) and the Docker container, but if that is the case, I do not know how to fix it. I suppose I should then somehow separate node_modules from the source directory and make it private to the Docker container, so that it is not shared with the host machine.
This is quite a severe problem, as it slows the development down a lot. Please share any of your ideas about what could be wrong. I would even consider changing my development environment to Linux if the problem was caused by the file sharing mechanism between Windows and docker container.
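One common way to do what the question already hints at (keep node_modules inside the container rather than on the Windows share) is to shadow it with a volume in docker-compose. A sketch, assuming the service is called my-service and the code is mounted at /app:
services:
  my-service:
    build: .
    volumes:
      - ./:/app            # project source shared with the Windows host
      - /app/node_modules  # anonymous volume: node_modules stays on the Linux side, not on the Windows share
yarn add still has to run inside the container, but it then writes to a Linux-native volume, which is usually much faster than going through the Windows file-sharing layer.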

ng build returns fatal out of memory exception in Docker

I'm trying to build the frontend of a web application in a Node.js Docker container. As I'm on a Windows PC, I'm very limited in my base images. I chose this one (the stefanscherer image in the FROM line below), as it's the only one on Docker Hub with a decent number of downloads. As the application is meant to run in Azure, I'm also limited to Windows Server Core 2016. When I run the following Dockerfile, I get the error message below (on my host system the build runs fine, by the way):
FROM stefanscherer/node-windows:10.15.3-windowsservercore-2016
WORKDIR /app
RUN npm install -g @angular/cli@6.2.4
COPY . ./
RUN ng build
#
# Fatal error in , line 0
# API fatal error handler returned after process out of memory on the background thread
#
#
#
#FailureMessage Object: 000000E37E3FA6D0
I tried increasing the memory available to the build process with --max_old_space_size up to 16 GB (the entire RAM of my laptop), but that didn't help. I also contacted the author of the base image to find out if that's the issue, but as this doesn't seem to be reproducible with a smaller example application, that wasn't very fruitful either. I've been working on this issue for a week now and I'm seriously out of ideas about what the reason could be, so I hope to get a new impulse from here, or at least a direction I could investigate.
What I also tried was installing Node.js and Angular on a Windows Server Core base image. If someone has an idea of how to do that, that could also be a solution.
EDIT: I noticed that the error message is the only output I get from the build process; it doesn't even get as far as trying to build the modules. Maybe that means something...
Alright, I figured it out. Although the official Docker documentation states that Docker has unlimited access to resources, it seems that you need to use the -m option when your build process exceeds a certain amount of memory.
Edit: This question seems to be getting some views so maybe I should clarify this answer a bit. The root of the problem seems to be that under Windows, Docker runs inside a Hyper-V VM. So when the documentation talks about "unlimited access to resources", it doesn't mean your PC's resources, but instead the resources of that VM.
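In practice that means passing a memory limit to the build, for example (8g is just an example value; -m/--memory is a documented docker build flag for the classic builder):
docker build -m 8g -t frontend .    # allow the build container up to 8 GB instead of the Hyper-V default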
