I came across this blog post (using go as a scripting language) and tried to create a custom image that I can use to run Go scripts, i.e.:
FROM golang:1.15
RUN go get github.com/erning/gorun
RUN mount binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc
RUN echo ':golang:E::go::/go/bin/gorun:OC' | tee /proc/sys/fs/binfmt_misc/register
It fails with error:
mount: /proc/sys/fs/binfmt_misc: permission denied.
ERROR: Service 'go_saga' failed to build : The command '/bin/sh -c mount binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc' returned a non-zero code: 32
It's a read-only file system, so I can't change the permissions either. The task I'm trying to achieve is well documented here. Please help me with the following questions:
Is that even possible, i.e. to mount /proc/sys/fs/binfmt_misc and write to the file /proc/sys/fs/binfmt_misc/register?
If yes, how do I do that?
I guess, it would be great, if we could run golang scripts in the container.
First a quick disclaimer that I haven't done this binfmt trick to run go scripts. I suppose it might work, but I just use go run when I want to run something on the fly.
There's a lot to unpack in this. Container isolation runs an application with a shared kernel in an isolated environment. The namespaces, cgroups, and security settings are designed to prevent one container from impacting other containers or the host.
Why is that important? Because /proc/sys/fs/binfmt_misc is interacting with the kernel, and pushing a change to that would be considered a container escape since you're modifying the underlying host.
The next thing to cover is building an image vs running a container. When you build an image with the Dockerfile, you are defining the image filesystem and some metadata (labels, entrypoint, exposed ports, etc). Each RUN command executes that command inside a temporary container, based on the previous step's result, and when the command finishes it captures the changes to the container filesystem. When you mount another filesystem, that doesn't change the underlying container filesystem, so even if you could, the mount command would be a noop during the image build.
So if this is possible, you'll need to do it inside the container rather than at build time. That container will need to be privileged, since mounting filesystems and modifying /proc requires access not normally given to containers, and you'll be modifying the host kernel in the process. You'd need to make the container entrypoint run the mount and register the binfmt_misc entry, and figure out what to do if the entry is already registered, possibly pointing to a different directory in another container.
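A minimal sketch of what that entrypoint setup could look like, assuming the container is started with --privileged and gorun lives at /go/bin/gorun as in the Dockerfile above. The names BINFMT_DIR, ensure_binfmt_mounted and register_gorun are made up for illustration, and like the rest of this answer it is untested against a real host:

```shell
#!/bin/sh
# Sketch only: helper functions for a privileged container's entrypoint.
# BINFMT_DIR, ensure_binfmt_mounted and register_gorun are hypothetical names.
BINFMT_DIR="${BINFMT_DIR:-/proc/sys/fs/binfmt_misc}"

# Mount binfmt_misc unless its control file is already visible.
ensure_binfmt_mounted() {
    [ -e "$BINFMT_DIR/register" ] && return 0
    mount -t binfmt_misc binfmt_misc "$BINFMT_DIR"
}

# Register the .go handler unless an entry named "golang" already exists.
# An existing entry may point at a different gorun path (set by another
# container), so report it instead of silently replacing it.
register_gorun() {
    if [ -e "$BINFMT_DIR/golang" ]; then
        echo "binfmt_misc entry 'golang' already registered; leaving it as is" >&2
        return 0
    fi
    printf '%s\n' ':golang:E::go::/go/bin/gorun:OC' > "$BINFMT_DIR/register"
}

# Intended use at the top of an entrypoint script:
#   ensure_binfmt_mounted && register_gorun && exec "$@"
```

Because the registration lives in the shared kernel, the "already registered" branch is where you would decide how to handle an entry left behind by another container.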
As an aside, when dealing with binfmt_misc and containers, the F flag is very important, though in your use case it's important that you don't have it. Typically you need the F flag so the binary is found on the host filesystem rather than searched for within the container filesystem namespace. The typical use case of binfmt_misc and containers is configuring the host to be able to run containers for different architectures, e.g. Docker Desktop can run amd64, arm64, and a bunch of other platforms today using this.
In the end, if you want to run a container as a one-off to run a go command as a script, I'd skip the binfmt_misc trick and make an entrypoint that does a go run instead. But if you're using the container for longer-running processes where you want to periodically run a go file as a script, you'll need to do that in the container, and as a privileged container that has the ability to escape to the host.
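For the one-off case, the go run entrypoint could be as small as the sketch below. The names run_go_script and GO_BIN are hypothetical; GO_BIN only exists so the go binary can be swapped out for a dry run:

```shell
#!/bin/sh
# Sketch only: run a Go source file as a "script" via `go run`.
# run_go_script and GO_BIN are hypothetical names.
run_go_script() {
    if [ "$#" -lt 1 ]; then
        echo "usage: run_go_script FILE.go [ARGS...]" >&2
        return 2
    fi
    script="$1"
    shift
    # Everything after the file name is passed through to the program.
    "${GO_BIN:-go}" run "$script" "$@"
}

# Intended use as the last line of an entrypoint script:
#   run_go_script "$@"
```

With an image built on golang:1.15 and this as its entrypoint, something like docker run --rm -v "$PWD":/src my-go-runner /src/hello.go would compile and run the file without any kernel changes (my-go-runner is a placeholder image name).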
Related
I'm running Jenkins inside a Docker container. I wonder if it's ok for the Jenkins container to also be a Docker host? What I'm thinking about is to start a new docker container for each integration test build from inside Jenkins (to start databases, message brokers etc). The containers should thus be shutdown after the integration tests are completed. Is there a reason to avoid running docker containers from inside another docker container in this way?
Running Docker inside Docker (a.k.a. dind), while possible, should be avoided, if at all possible. (Source provided below.) Instead, you want to set up a way for your main container to produce and communicate with sibling containers.
Jérôme Petazzoni — the author of the feature that made it possible for Docker to run inside a Docker container — actually wrote a blog post saying not to do it. The use case he describes matches the OP's exact use case of a CI Docker container that needs to run jobs inside other Docker containers.
Petazzoni lists two reasons why dind is troublesome:
It does not cooperate well with Linux Security Modules (LSM).
It creates a mismatch in file systems that creates problems for the containers created inside parent containers.
From that blog post, he describes the following alternative,
[The] simplest way is to just expose the Docker socket to your CI container, by bind-mounting it with the -v flag.
Simply put, when you start your CI container (Jenkins or other), instead of hacking something together with Docker-in-Docker, start it with:
docker run -v /var/run/docker.sock:/var/run/docker.sock ...
Now this container will have access to the Docker socket, and will therefore be able to start containers. Except that instead of starting "child" containers, it will start "sibling" containers.
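A rough sketch of the sibling-container flow for integration tests, assuming the docker CLI is installed inside the CI container. The DOCKER variable, the function names, and any image or container names you pass in are placeholders, not anything from the blog post:

```shell
#!/bin/sh
# Sketch only. DOCKER and the function names are hypothetical.
DOCKER="${DOCKER:-docker}"

# On the host: start the CI container with the host's Docker socket mounted.
start_ci_container() {
    image="$1"
    shift
    "$DOCKER" run -d -v /var/run/docker.sock:/var/run/docker.sock "$image" "$@"
}

# Inside the CI container: start a sibling, e.g. a database for a test run.
# --rm removes the container once it stops.
start_sibling() {
    name="$1"
    image="$2"
    "$DOCKER" run -d --rm --name "$name" "$image"
}

# Inside the CI container: stop the sibling once the tests are done.
stop_sibling() {
    "$DOCKER" stop "$1"
}
```

Since the siblings are created by the host's daemon, any -v paths used when starting them refer to the host's file system, not the CI container's.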
I answered a similar question before on how to run a Docker container inside Docker.
To run docker inside docker is definitely possible. The main thing is that you run the outer container with extra privileges (starting with --privileged=true) and then install docker in that container.
Check this blog post for more info: Docker-in-Docker.
One potential use case for this is described in this entry. The blog describes how to build docker containers within a Jenkins docker container.
However, Docker inside Docker is not the recommended approach for this type of problem. Instead, the recommended approach is to create "sibling" containers as described in this post.
So, running Docker inside Docker used to be considered by many a good solution for this type of problem. Now, the trend is to use "sibling" containers instead. See the answer by #predmijat on this page for more info.
It's OK to run Docker-in-Docker (DinD) and in fact Docker (the company) has an official DinD image for this.
The caveat however is that it requires a privileged container, which depending on your security needs may not be a viable alternative.
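As a sketch of what that looks like in practice, assuming the official image is pulled under the docker:dind tag (the DOCKER variable, function names and container name are placeholders):

```shell
#!/bin/sh
# Sketch only. DOCKER, start_dind and dind_exec are hypothetical names.
DOCKER="${DOCKER:-docker}"

# Start a Docker daemon inside a container; note the --privileged flag.
start_dind() {
    "$DOCKER" run --privileged -d --name "$1" docker:dind
}

# Run a docker CLI command against the inner daemon.
dind_exec() {
    name="$1"
    shift
    "$DOCKER" exec "$name" docker "$@"
}
```

For example, start_dind inner followed by dind_exec inner ps lists the containers known to the inner daemon, which are separate from the host's.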
The alternative solution of running Docker using sibling containers (aka Docker-out-of-Docker or DooD) does not require a privileged container, but has a few drawbacks that stem from the fact that you are launching the container from within a context that is different from the one in which it runs (i.e., you launch the container from within a container, yet it runs at the host's level, not inside the container).
I wrote a blog describing the pros/cons of DinD vs DooD here.
Having said this, Nestybox (a startup I just founded) is working on a solution that runs true Docker-in-Docker securely (without using privileged containers). You can check it out at www.nestybox.com.
Yes, we can run Docker in Docker. We need to attach the Unix socket /var/run/docker.sock, on which the Docker daemon listens by default, as a volume to the parent container using -v /var/run/docker.sock:/var/run/docker.sock.
Sometimes permission issues arise on the Docker daemon socket, for which you can run sudo chmod 757 /var/run/docker.sock.
It would also require running the container in privileged mode, so the commands would be:
sudo chmod 757 /var/run/docker.sock
docker run --privileged=true -v /var/run/docker.sock:/var/run/docker.sock -it ...
I spent the past few days trying my best to run containers within containers, just like you, and wasted many hours. So far, most people have advised me either to use Docker's DinD image, which is not applicable in my case because I need the main container to be Ubuntu, or to run some privileged command and map the daemon socket into the container (which has never worked for me).
The solution I found was to use Nestybox on my Ubuntu 20.04 system, and it works best. It's also extremely simple to run, provided your local system is Ubuntu (which they support best), as the container runtime is specifically designed for this kind of application. It also has the most flexible options. The free edition of Nestybox is perhaps the best method as of Nov 2022. I highly recommend trying it rather than bothering with the tedious setup other people suggest. They have many pre-built solutions that address such specific needs with a single command line.
Nestybox provides a special runtime environment for newly created Docker containers; they also provide some Ubuntu and other common OS images with Docker and systemd built in.
Their goal is to make the main container function exactly like a virtual machine, securely. You can literally ssh into your Ubuntu main container without being able to access anything on the host machine. From your main container you can create all kinds of containers, as a normal local system does. Having systemd is very important for setting up Docker conveniently inside the container.
One simple common command to execute sysbox:
docker run --runtime=sysbox-runc -it any_image
If you think that's what you are looking for, you can find out more on their GitHub:
https://github.com/nestybox/sysbox
Quicklink to instruction on how to deploy a simple sysbox runtime environment container: https://github.com/nestybox/sysbox/blob/master/docs/quickstart/README.md
I want to start the following docker container and have terminal access to it:
docker run -it docker:5000/builds/build-lnx64-centos7:latest /bin/bash
The problem is that inside the terminal I can not find any of the files in my file system. No ~/Desktop and similar directories.
Question: how to access the file system of my local PC from within the docker container?
By default, containers cannot see the file system of their host.
If you want to achieve this, you will have to explicitly "mount" whatever directories you want to see using the -v flag, like this:
docker run -v ~/Desktop:/host-desktop -it docker:5000/builds/build-lnx64-centos7:latest /bin/bash
If you run that command, you will see the contents of your desktop in the container's file system, at /host-desktop.
You really would not want your containers to be able to see the entire host file system. That would be dangerous, especially if the container has write permission. You should always "mount" only the exact files/directories you want the container to access.
For the most part, any project I have worked on that uses docker does "volume mounting" so that the container can write files and the developer can easily access them on the host (e.g. selenium tests taking screenshots) or so the developer can edit source code and the container will see the update and hot-reload (e.g. nodejs development). When doing the latter (hot-reload example), it is usually wise to mount in read-only mode.
See the docs for more details: https://docs.docker.com/engine/reference/commandline/run/#mount-volume--v---read-only
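To make the read-only case concrete, here is a small sketch that builds the argument for -v with an explicit access mode. The helper name bind_mount_arg is made up for illustration; the host:container:mode form itself is what -v accepts:

```shell
#!/bin/sh
# Sketch only: bind_mount_arg is a hypothetical helper that prints the
# HOST:CONTAINER:MODE string expected by `docker run -v`.
bind_mount_arg() {
    host_dir="$1"
    container_dir="$2"
    mode="${3:-rw}"   # default to read-write, as docker does
    case "$mode" in
        ro|rw) ;;
        *) echo "mode must be ro or rw" >&2; return 2 ;;
    esac
    printf '%s:%s:%s\n' "$host_dir" "$container_dir" "$mode"
}

# Example (read-only mount of the desktop from the question):
#   docker run -v "$(bind_mount_arg "$HOME/Desktop" /host-desktop ro)" \
#       -it docker:5000/builds/build-lnx64-centos7:latest /bin/bash
```

With ro, the container can read the files under /host-desktop, but writes to that path fail.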
My current development environment allows for automatic code reload whenever changing a file (i.e nodemon / webpack). However I am setting up a kubernetes (minikube) environment so that I can quickly open 3-4 related services at once.
Everything is working fine, but it is not currently doing the automatic code reload. I tried mounting the volume but there is some conflict with the way docker and virtualbox handles files such that the conflict leads to file changes from the host not reflected in docker container. (That's not the first link I have that appears related to this problem, it's just the first I found while googling it on another day)...
Anyway, long story short, people have trouble getting live reload working in development. I've found the problem littered throughout the web with very few solutions. The best solution I've found so far is one where the author used tar from the host to sync folders.
However I would like a solution from the container. The reason is that I want to run the script from the container so that the developer doesn't have to run some script on his host computer every time he starts developing in a particular repo.
In order to do this however I need to run rsync from the container to the host machine. And I'm having a surprising lot of trouble figuring out how to write the syntax for that.
Let's pretend my app exists in my container and host respectively as:
/workspace/app # location in container
/Users/terence/workspace/app # location in host computer
How do I rsync from the container to the host? I've tried using the 172.17.0.17 and 127.0.0.1 to no avail. Not entirely sure if there is a way to do it?
examples I tried:
rsync -av 172.17.0.17:Users/terence/workspace/app /workspace/app
rsync -av 127.0.0.1:Users/terence/workspace/app /workspace/app
If you're running the rsync from the host (not inside the container), you could use docker cp instead:
e.g., docker cp containerName:/workspace/app /Users/terence/workspace/app
Could you clarify:
1. are you running the rsync from the host or from inside the container?
If it's from inside the container it'll depend a lot on the --network the container is attached to (i.e., bridged or host) and also the mounted volumes (i.e., when you started up the container did you use -v flag?)
Update: For rsync to work from within the container you need to expose the host's dir to the container.
As you think of a solution, keep this in mind: host dir as a data volume
Note: The host directory is, by its nature, host-dependent. For this reason, you can’t mount a host directory from Dockerfile, the VOLUME instruction does not support passing a host-dir, because built images should be portable. A host directory wouldn’t be available on all potential hosts.
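One way this could look, as a sketch: assume the container was started with -v /Users/terence/workspace/app:/host-app, so the host directory is visible inside the container at /host-app (a mount point chosen here purely for illustration). rsync then only needs two local paths and no IP address. The names sync_dir and RSYNC are hypothetical:

```shell
#!/bin/sh
# Sketch only. sync_dir and RSYNC are hypothetical names; /host-app is an
# assumed mount point for the host directory inside the container.
RSYNC="${RSYNC:-rsync}"

# Copy the contents of SRC into DST. The trailing slashes make rsync copy
# the directory contents rather than the directory itself.
sync_dir() {
    src="${1%/}"
    dst="${2%/}"
    "$RSYNC" -av "$src/" "$dst/"
}

# Inside the container, pull host changes into the app directory:
#   sync_dir /host-app /workspace/app
```

Run from the container's start script, possibly in a loop with a short sleep, this keeps /workspace/app following the host copy without the developer running anything on the host.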
Running docker on the Mac, with a centos image, I see mounted volumes taking on the ownership of the centos (internal) user, while on the filesystem the ownership is mine (mdf:mdf).
Using the same centos image on RHEL 7, I see the volumes mounted, but inside, in centos, the home dir and the files all show my uid (1055).
I can do a recursive chown to, e.g., insideguy:insideguy, and all looks right. But back in the host filesystem, the ownerships have changed to some other person in the registry that has the same uid as was selected for insideguy(1001) when useradd was executed.
Is there some fundamental limitation in docker for Linux that makes this happen?
As another side effect, in our cluster one cannot chown on a mounted filesystem, even with sudo privileges; only on a local filesystem. So the desire to keep the docker home directories in, e.g., ~/dockerhome, fails because docker seems to be trying (and failing) to perform some chowns (not described in the Dockerfile or the start script, so assumed to be part of the --volume treatment). Placed in /var or /opt with appropriate ownerships, all goes well.
Any idea what's different between the two docker hosts?
Specifics: OSX 10.11.6; docker v1.12.1 on mac, v1.12.2 on RHEL 7; centos 7
There is a fundamental limitation to Docker on OS X that makes this happen: that is the fact that Docker only runs on Linux.
When running Docker on other platforms, this requires first setting up a Linux VM (historically through VirtualBox, although more recently other options are available) and then running Docker inside that VM.
When Docker runs natively on Linux, as on your RHEL host, it shares filesystems directly with the host when you use something like docker run -v /host/path:/container/path. So if inside the container you run chown userA somefile and userA has user id 1001, and on your host that user id belongs to userB, then of course when you look at the files on the host they will appear to be owned by userB. There's no magic here; this is just how Unix file permissions work. You get the same behavior if, say, you were to move a disk or NFS filesystem from one host to another that had conflicting entries in their local /etc/passwd files.
Most Docker containers are running as root (or at least, not as your local user). This means that any files created by a process in Docker will typically not be owned by you, which can of course cause problems if you are trying to access a filesystem that does not permit this sort of access. Your choices when using Docker are pretty much the same choices you have when not using Docker: either ensure that you are running containers as your own user id -- which may not be possible, since many images are built assuming they will be running as root -- or arrange to store files somewhere else.
This is one of the reasons why many people discourage the use of host volume mounts, because it can lead to this sort of confusion (and also because when interacting with a remote Docker API, the remote Docker daemon doesn't have any access to your local host filesystem).
With Docker for Mac, there is some magic file sharing that goes on to expose your local filesystem to the Linux VM (for example, with VirtualBox, Docker may use the shared folders feature). This translation layer is probably the cause of the behavior you've noted on OS X with respect to file ownership.
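As a sketch of the "run as your own user id" option on a Linux host: docker run accepts --user UID:GID, and looking at numeric owners with ls -ln shows what is actually stored on disk, independent of either side's /etc/passwd. The helper names below are made up for illustration:

```shell
#!/bin/sh
# Sketch only: current_user_flag and owner_uid are hypothetical helpers.

# Print the invoking user's numeric UID:GID, suitable for `docker run --user`.
current_user_flag() {
    printf '%s:%s\n' "$(id -u)" "$(id -g)"
}

# Print the numeric owner uid of a path (third field of `ls -ln` output).
owner_uid() {
    ls -lnd "$1" | awk '{print $3}'
}

# Example:
#   docker run --user "$(current_user_flag)" -v ~/dockerhome:/home/insideguy ...
#   owner_uid ~/dockerhome/somefile
```

Files created through the mount then carry your uid on the host, provided the image works when not running as root.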
I am experimenting with Docker and understanding concepts around use of volumes. I have a tomcat app which writes files to a particular volume.
I write a Dockerfile with ENTRYPOINT of "dosomething.sh"
The issue I have with the entrypoint script is this:
"dosomething.sh" could potentially contain malicious code that deletes all files on the volume!
Is there a way to guard against this? I was planning on sharing this Dockerfile and script with my dev team too, and the care I would have to take for a production rollout appears scary.
One thought is not to have an ENTRYPOINT at all for any container that has volumes.
Experienced folks, please advise on how you deal with this.
If you are using a data volume container to isolate your volume, such a container never runs: it is only created (docker create).
That means you need to mount that data volume container into other containers for them to access the volume.
That somewhat mitigates the dangerous entrypoint: a plain docker run would have access to nothing, since no -v volume-mounting option would have been set.
Another approach is to at least declare the script as CMD, not ENTRYPOINT (and set the ENTRYPOINT to [ "/bin/sh", "-c" ]). That way, it is easier to docker run with an alternative command (passed as a parameter, overriding CMD), instead of always having to execute the script just because it is the ENTRYPOINT.
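Both ideas can be sketched from the command line, assuming an image whose Dockerfile sets ENTRYPOINT ["/bin/sh", "-c"] and CMD ["dosomething.sh"]. The DOCKER variable, the function names, and the image, container and volume names are placeholders:

```shell
#!/bin/sh
# Sketch only. DOCKER and all function names are hypothetical.
DOCKER="${DOCKER:-docker}"

# Run the image with its default CMD, i.e. the script.
run_default() {
    "$DOCKER" run --rm "$1"
}

# Run an alternative command instead of the script. Because the entrypoint
# is `/bin/sh -c`, the command is passed as ONE string.
run_instead() {
    image="$1"
    shift
    "$DOCKER" run --rm "$image" "$*"
}

# Create (but never start) a data volume container holding VOLUME_PATH.
create_data_container() {
    "$DOCKER" create -v "$2" --name "$1" "$3"
}

# Run an image with the volumes of the data container attached.
run_with_volumes_from() {
    name="$1"
    image="$2"
    shift 2
    "$DOCKER" run --rm --volumes-from "$name" "$image" "$@"
}
```

For example, run_instead myapp ls -l /data inspects the image without executing dosomething.sh, and only containers started through run_with_volumes_from can see the data at all.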