File permissions for mounted volumes in image processing pipelines?

We're using docker to containerize some image processing pipelines to make sharing them with collaborators easier.
Our current approach is to mount an "inputs" directory (which contains an image, e.g. a single jpg) and an "outputs" directory (which receives the processed data, e.g. a segmentation of the input image). The problem is that we run docker with sudo, so after processing completes, the files in the outputs directory are owned by root.
Is there a standard or preferred way to set the files in mounted volumes to have the permissions of the calling user?

Perhaps you can use the --user flag in docker run
e.g.
docker run --user $UID [other flags...] image [cmd]
Alternatively the following might work (untested)
In Dockerfile
ENTRYPOINT "su $USERID -c"
Followed by:
docker run -e USERID=$UID [other flags...] image [cmd]
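A minimal sketch of the --user idea, assuming the goal is simply that files written to the outputs mount end up owned by the calling user. It only prints the command rather than running it; the image name my-pipeline-image and the /inputs and /outputs container paths are made up for illustration. It uses id -u and id -g instead of $UID because $UID is a bash variable and is not set in every shell.

```shell
#!/bin/sh
# Print (do not run) a docker command that starts the container as the
# calling user and group, so output files are not owned by root.
# Hypothetical names: my-pipeline-image, /inputs, /outputs.
build_run_cmd() {
    in_dir=$1
    out_dir=$2
    image=$3
    printf 'docker run --rm --user %s:%s -v %s:/inputs:ro -v %s:/outputs %s\n' \
        "$(id -u)" "$(id -g)" "$in_dir" "$out_dir" "$image"
}

build_run_cmd "$PWD/inputs" "$PWD/outputs" my-pipeline-image
```

If you prefix the printed command with sudo, the IDs have already been expanded by your own shell, so they are still yours and not root's.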

Try setting the user in your Dockerfile so that when the container is started it uses 'tomcat' for example.
# example
USER tomcat

Related

Docker mount files to my local system while running the container

I am using the image academind/node-example-1, which is a simple Node image. You can check it here: https://hub.docker.com/r/academind/node-example-1. When I run the image, I want to get the same folder and file structure on my machine that exists in the image. I know that can be done with a volume. When I run
docker run -d --rm --name node-test -p 5000:80 academind/node-example-1
everything works. But I want to get the codebase while it is running, so I tried
docker run -d --rm --name node-test -p 5000:80 -v /Users/souravbanerjee/Documents/node-docker:node-example-1 academind/node-example-1
Here node-docker is my local folder where I expect the code to appear. The container runs, but I am not getting the files on my local machine. I am unsure about the source_path:destination_path part. Please tell me where I'm wrong, what to do instead, or whether my whole approach is going in the wrong direction.
Thanks.

If you read the official docs, you'll see that the part before the : should be a path on the host machine (which you're doing), while the part after it should be a path inside the container (you're using the image name instead). Assuming /app is that path (I've taken that course myself and this is the path AFAIR), it should be:
docker run -d --rm --name node-test -p 5000:80 -v /Users/souravbanerjee/Documents/node-docker:/app academind/node-example-1
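To make the host_path:container_path rule concrete, here is a rough sketch of a check you could run on a -v argument before handing it to docker. The function name check_mount_spec is made up, and it assumes a simple Unix-style spec that contains at least one colon (no Windows drive letters).

```shell
#!/bin/sh
# Split "host_path:container_path[:options]" and verify that the
# container side is an absolute path, which is what docker expects.
check_mount_spec() {
    spec=$1
    src=${spec%%:*}      # text before the first colon
    rest=${spec#*:}      # text after the first colon
    dest=${rest%%:*}     # container path, with any ":ro"/":rw" removed
    case $dest in
        /*) echo "ok: host '$src' -> container '$dest'" ;;
        *)  echo "bad: container path '$dest' is not absolute"
            return 1 ;;
    esac
}

# The corrected spec from the answer, then the one from the question.
check_mount_spec /Users/souravbanerjee/Documents/node-docker:/app
check_mount_spec /Users/souravbanerjee/Documents/node-docker:node-example-1 || true
```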

I think the correct syntax is to enter the path information in the volume mapping.
E.g. Users/xxxx/:some/drive/location
However, that would map your empty folder at xxxx over the top of the 'location' folder, hiding the existing files in the container for as long as the mount is in place.
If you are interested in seeing the contents of the files in the container, consider using the docker cp command.
People often use volume mounts to push data (e.g. persistent database files) into a container.
Alternatively, the application in the container can write log files to the volume-mounted location, and those files are then reflected on your local drive.

You can copy the files to the current host directory using the command
docker cp node-test:/app .
when the container is running

How do I create a directory in a Docker container that won't start?

I have a Docker container (not image) that crashes when I try to start it. The Docker logs show that it is failing because an Apache2 conf file can't find a directory (/var/www/html/log/). This happened while I was setting up SSL: I referenced the directory in the 000-default.conf file and restarted Apache, but forgot to create it.
How do I create this directory in the container without having to start the container itself?

4.5 options come to mind:
You can rebuild the image and set up the directory while doing it.
You can attach a volume while starting the image, but in this case your changes will remain in your disk and not in your container.
You can run the image overriding the entry point with --entrypoint="bash" or something similar. You need the -ti flags so that it starts in interactive mode. Then make your changes and run docker commit -p <container> <image:tag>. The -p flag pauses the container while committing. I recommend this unless the container absolutely needs to keep running.
I am not sure if this one works, so I give it half a point :P, but if it does it would be the fastest option. You can start the container in interactive mode with docker start -i container, which attaches a terminal. If you have enough time before the container exits or reads that part of the configuration, you can create the folder.
Ah finally, I have just remembered, you should be able to move files and folders from your file system to container using docker cp [container:]<source> [container:]<destination> even while container is not running.

In general, if you're using a base Docker image for Apache (for example, httpd/2.4/Dockerfile), it should already have "/var/www/html/log".
SUGGESTION 1: Please make sure you're starting with a "good" base image.
SUGGESTION 2: Add "mkdir -p /var/www/html/log" to your Dockerfile, and rebuild the image.
I'm not sure how you're using your image - what you want the image to contain besides Apache - but:
SUGGESTION 3: Google for a simple tutorial that matches your use case, and see what steps you might be "missing". For example: Dockerize your Laravel Application

Docker -- mounting a volume not behaving like regular mount

I am new to docker so I am certain I am doing something wrong. I am also not a php developer but that shouldn't matter in this case.
I am using a drupal docker image which has data at the /var/www/html directory.
I am attempting to overwrite this data with a drupal site from a local directory on the host system.
According to the docs this is the expected behavior
Mount a host directory as a data volume
In addition to creating a volume using the -v flag you can also mount a directory from your Docker engine’s host into a container.
$ docker run -d -P --name web -v /src/webapp:/webapp training/webapp python app.py
This command mounts the host directory, /src/webapp, into the container at /webapp. If the path /webapp already exists inside the container’s image, the /src/webapp mount overlays but does not remove the pre-existing content. Once the mount is removed, the content is accessible again. This is consistent with the expected behavior of the mount command.
However I am finding that the local drupal site files do not exist on the container. My complete workflow is as follows:
docker-compose.yml
drupal:
  container_name: empower_drupal
  build: ./build/drupal-local-codebase
  ports:
    - "8888:80"
    - "8022:22"
    - "443"
#volumes: THIS IS ALSO NOT WORKING
#- /home/sameh/empower-tap:/var/www/html
$ docker-compose up -d
# edit the container by snapshotting it
$ docker commit empower_drupal empower_drupal1
$ docker run -d -P --name empower_drupal2 -v /home/sameh/empower-tap:/var/ww/html empower_drupal1
# snapshot the container to examine it
$ docker commit 9cfeca48efd3 empower_drupal2
$ docker run -t -i empower_drupal2 /bin/bash
The empower_drupal2 container does not have the correct files from the /home/sameh/empower-tap directory.

Why this did not work
Here's what you did, with some annotations.
$ docker-compose up -d
Given your docker-compose.yml, with the volumes section commented out, at this point you have a running container, but no volumes mounted.
# edit the container by snapshotting it
$ docker commit empower_drupal empower_drupal1
All you've really done here is made a copy of the image you had already, unless your container makes changes to itself on startup.
$ docker run -d -P --name empower_drupal2 -v /home/sameh/empower-tap:/var/ww/html empower_drupal1
Here you have run your new copy with a volume mounted. OK, the files are available in this container now.
# snapshot the container to examine it
$ docker commit 9cfeca48efd3 empower_drupal2
I'm assuming here that you wanted to commit the contents of the volume into the image. That will not work. The commit documentation is clear about this point:
The commit operation will not include any data contained in volumes mounted inside the container.
$ docker run -t -i empower_drupal2 /bin/bash
So, as you found, when you run the image generated by commit, but without volume mounts, the files are not there.
Also, it is not clear in your docker-compose.yml example where the volumes: section was before it was commented out. Currently it seems to be on the left margin, which would not work. It would need to be at the same level as build: and ports: in order to work on your drupal service.
What to do instead
That depends on your goal.
Just copy the files from local
If you literally just want to populate the image with the files from your local system, you can do that in Dockerfile.
COPY local-dir/* /var/www/html
You mentioned that this copy can't work because the directory is not local. Unfortunately that cannot be solved easily with something like a symlink. Your best option is to copy the directory to the local context before building. Docker does not plan to change this behavior.
Override contents for development
A common scenario is you want to use your local directory for development, so that changes are reflected right away instead of doing a rebuild. But when not doing development, you want the files baked into the image.
In that case, start by telling Dockerfile to copy the files into the image, as above. That way an image build will contain them, volume mount or no.
Then, when you are doing development, use volumes: in docker-compose.yml, or the -v flag to docker run, to mount a volume. A volume mount will override whatever is baked into the image, so you will be using your local files. When you're done and the code is ready to go, just do an image build and your final files will be baked into the image for deployment.
Use a volume plus a commit
You can also do this in a slightly roundabout way by mounting the volume, copying the contents elsewhere, then committing the result.
# start a container with the volume mounted somewhere
docker run -d -v /home/sameh/empower-tap:/var/www/html_temp [...etc...]
# copy the files elsewhere inside the container
docker exec <container-name> cp -r /var/www/html_temp /var/www/html
# commit the result
docker commit empower_drupal empower_drupal1
Then you should have your mounted volume files in the resulting image.

Mount data volume to docker with read&write permission

I want to mount a host directory into a Docker container. The container should have read and write permission on it, but any changes made to the data in the container should not affect the data on the host.
I can imagine a solution that mounts several data volumes onto a single folder, one read-only and another read-write. But only the second '-v' works in my command:
docker run -ti --name build_cent1 -v /codebase/:/code:ro -v /temp:/code:rw centos6:1.0 bash

only this second '-v' works in my command,
That might be because both -v options attempt to mount host folders on the same container destination folder /code.
-v /codebase/:/code:ro
              ^^^^^
-v /temp:/code:rw
         ^^^^^
You could mount those host folders in two separate folders within /code.
As in:
-v /codebase/:/code/base:ro -v /temp:/code/temp:rw
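As a quick illustration of the clash described above, here is a small sketch that scans a list of -v specs and prints any container destination used more than once. The helper name find_duplicate_dests is made up, and it assumes plain host_path:container_path[:options] specs.

```shell
#!/bin/sh
# Print every container destination that appears in more than one spec.
find_duplicate_dests() {
    for spec in "$@"; do
        rest=${spec#*:}        # drop the host path
        echo "${rest%%:*}"     # keep the container path, drop ":ro"/":rw"
    done | sort | uniq -d
}

# The two mounts from the question share the destination /code.
find_duplicate_dests /codebase/:/code:ro /temp:/code:rw

# The separated layout has no clash, so this prints nothing.
find_duplicate_dests /codebase/:/code/base:ro /temp:/code/temp:rw
```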

Normally in this case I think you ADD the folder to the Docker image, so that any container running it will have it in its (writeable) filesystem, but writes will go to a different layer.
You need to write a Dockerfile in the folder above the one you wish to use, which should look something like this:
FROM my/image
ADD codebase /codebase
Then you build the image using docker build -t some-name <path>. These steps could be added to the build scripts of your app (maybe you will find some plugin to help there). Then you can docker run some-name.
The downside is that there is one copy to do and an image to create, but if you launch many containers they will share the same read-only copy of the layer and write their own modifications to independent layers above.

Got one answer from nixun in github.
you can simply use overlayfs to fix this:
mount -t overlay overlay \
-olowerdir=/codebase,upperdir=/temp,workdir=/workdir /codebase_new
docker run -ti --name build_cent1 -v /codebase_new:/code:rw centos6:1.0 bash
This solution is quite flexible. Creating an image with the shared folder would also be a solution, but the folder data cannot be updated easily that way.

This answer is not for docker users but it will help anyone who uses Lima to manage their containers.
I was stuck trying to solve the issue with limactl and lima nerdctl . I thought it is worth sharing the fix so that it may help anyone in the community who's using lima instead of docker:
By default Lima mounts volumes as read-only. To make them writable by default, do the following:
Edit the file and set writable: true under the mounts section
$ vim ~/.lima/default/lima.yaml
then restart lima
limactl list #this lists all running vms
limactl stop default #or name of the machine
limactl start default #or name of the machine
you would still need to specify mount options exactly as with docker
lima nerdctl run -ti --name build_cent1 \
-v /codebase/:/code/base:ro \
-v /temp:/code/temp:rw \
centos6:1.0 bash
For more information about lima, please check this out

Exploring Docker container's file system

I've noticed with docker that I need to understand what's happening inside a container or what files exist in there. One example is downloading images from the docker index - you don't have a clue what the image contains so it's impossible to start the application.
What would be ideal is to be able to ssh into them or equivalent. Is there a tool to do this, or is my conceptualisation of docker wrong in thinking I should be able to do this.

Here are a couple different methods...
A) Use docker exec (easiest)
Docker version 1.3 or newer supports the exec command, which behaves similarly to nsenter. This command can run a new process in an already running container (the container must already have its PID 1 process running). You can run /bin/bash to explore the container state:
docker exec -t -i mycontainer /bin/bash
see Docker command line documentation
B) Use Snapshotting
You can evaluate container filesystem this way:
# find ID of your running container:
docker ps
# create image (snapshot) from container filesystem
docker commit 12345678904b5 mysnapshot
# explore this filesystem using bash (for example)
docker run -t -i mysnapshot /bin/bash
This way, you can evaluate the filesystem of the running container at a precise moment in time. The container is still running, and no future changes are included.
You can later delete the snapshot using (the filesystem of the running container is not affected!):
docker rmi mysnapshot
C) Use ssh
If you need continuous access, you can install sshd to your container and run the sshd daemon:
docker run -d -p 22 mysnapshot /usr/sbin/sshd -D
# you need to find out which port to connect:
docker ps
This way, you can run your app using ssh (connect and execute what you want).
D) Use nsenter
Use nsenter, see Why you don't need to run SSHd in your Docker containers
The short version is: with nsenter, you can get a shell into an
existing container, even if that container doesn’t run SSH or any kind
of special-purpose daemon

UPDATE: EXPLORING!
This command should let you explore a running docker container:
docker exec -it name-of-container bash
The equivalent for this in docker-compose would be:
docker-compose exec web bash
(web is the name-of-service in this case and it has tty by default.)
Once you are inside do:
ls -lsa
or any other bash command like:
cd ..
This command should let you explore a docker image:
docker run --rm -it --entrypoint=/bin/bash name-of-image
once inside do:
ls -lsa
or any other bash command like:
cd ..
The -it stands for interactive... and tty.
This command should let you inspect a running docker container or image:
docker inspect name-of-container-or-image
You might want to do this and find out if there is any bash or sh in there. Look for entrypoint or cmd in the json return.
NOTE: This answer relies on common tools being present. If there is no bash shell or common tools like ls present, you could first add them in a layer if you have access to the Dockerfile:
example for alpine:
RUN apk add --no-cache bash
Otherwise, if you don't have access to the Dockerfile, just copy the files out of a newly created container and look through them:
docker create <image> # returns the container ID; the container is never started
docker cp <container ID>:<source_path> <destination_path>
docker rm <container ID>
cd <destination_path> && ls -lsah
see docker exec documentation
see docker-compose exec documentation
see docker inspect documentation
see docker create documentation

In case your container is stopped or doesn't have a shell (e.g. hello-world mentioned in the installation guide, or non-alpine traefik), this is probably the only possible method of exploring the filesystem.
You may archive your container's filesystem into tar file:
docker export adoring_kowalevski > contents.tar
Or list the files:
docker export adoring_kowalevski | tar t
Do note, that depending on the image, it might take some time and disk space.
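If you only care about one directory, you can filter the listing instead of reading the whole archive. Below is a rough sketch: list_under is a made-up helper name, and the demo builds a tiny local tar file as a stand-in for the output of docker export so it can be tried without a container. Entries in an exported archive have no leading slash, so the prefix is written as var/ rather than /var/.

```shell
#!/bin/sh
# Read a tar stream on stdin and print only the entries whose path
# starts with the given prefix (the prefix is treated as a grep pattern).
list_under() {
    tar -tf - | grep "^$1"
}

# Stand-in for "docker export <container>": a small local archive.
tmp=$(mktemp -d)
mkdir -p "$tmp/rootfs/etc" "$tmp/rootfs/var/log"
echo demo > "$tmp/rootfs/etc/hostname"
echo demo > "$tmp/rootfs/var/log/app.log"
tar -cf "$tmp/contents.tar" -C "$tmp/rootfs" etc var

list_under var/ < "$tmp/contents.tar"
rm -rf "$tmp"
```

With a real container the last step would look like docker export adoring_kowalevski | list_under var/.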

Before Container Creation :
If you want to explore the structure of the image that the container is created from, you can do
sudo docker image save image_name > image.tar
tar -xvf image.tar
This would give you the visibility of all the layers of an image and its configuration which is present in json files.
After container creation :
There are already a lot of answers above for this. My preferred way to do this would be:
docker exec -t -i container /bin/bash

The most upvoted answer is working for me when the container is actually started, but when it isn't possible to run and you for example want to copy files from the container this has saved me before:
docker cp <container-name>:<path/inside/container> <path/on/host/>
Thanks to docker cp you can copy directly from the container as if it were any other part of your filesystem.
For example, recovering all files inside a container:
mkdir /tmp/container_temp
docker cp example_container:/ /tmp/container_temp/
Note that you don't need to specify that you want to copy recursively.

The file system of the container is in Docker's data folder, normally /var/lib/docker. To start a container and inspect its file system while it is running (assuming the aufs storage driver), do the following:
hash=$(docker run -d busybox sleep 600)
cd /var/lib/docker/aufs/mnt/$hash
And now the current working directory is the root of the container.

you can use dive to view the image content interactively with TUI
https://github.com/wagoodman/dive

Try using
docker exec -it <container-name> /bin/bash
bash may not be installed in the container; in that case you can use
docker exec -it <container-name> sh

On Ubuntu 14.04 running Docker 1.3.1, I found the container root filesystem on the host machine in the following directory:
/var/lib/docker/devicemapper/mnt/<container id>/rootfs/
Full Docker version information:
Client version: 1.3.1
Client API version: 1.15
Go version (client): go1.3.3
Git commit (client): 4e9bbfa
OS/Arch (client): linux/amd64
Server version: 1.3.1
Server API version: 1.15
Go version (server): go1.3.3
Git commit (server): 4e9bbfa

In my case no shell was supported in container except sh. So, this worked like a charm
docker exec -it <container-name> sh

The most voted answer is good except if your container isn't an actual Linux system.
Many containers (especially the Go-based ones) don't have any standard binary (no /bin/bash or /bin/sh). In that case, you will need to access the container's files directly:
Works like a charm:
name=<name>
dockerId=$(docker inspect -f {{.Id}} $name)
mountId=$(cat /var/lib/docker/image/aufs/layerdb/mounts/$dockerId/mount-id)
cd /var/lib/docker/aufs/mnt/$mountId
Note: You need to run it as root.

I use another dirty trick that is aufs/devicemapper agnostic.
I look at the command that the container is running e.g. docker ps
and if it's an apache or java i just do the following:
sudo -s
cd /proc/$(pgrep java)/root/
and voilà, you're inside the container.
Basically you can, as root, cd into the /proc/<PID>/root/ folder as long as that process is run by the container. Beware that symlinks will not make sense while using this mode.

Only for LINUX
The simplest way I have used is the proc dir. The container must be running in order to inspect its files.
Find out the process id (PID) of the container and store it into some variable
PID=$(docker inspect -f '{{.State.Pid}}' your-container-name-here)
Make sure the container process is running, and use the variable name to get into the container folder
cd /proc/$PID/root
If you want to get through the dir without finding out the PID number, just use this long command
cd /proc/$(docker inspect -f '{{.State.Pid}}' your-container-name-here)/root
Tips:
After you get inside the container, everything you do will affect the actual process of the container, such as stopping the service or changing the port number.
Hope it helps
Note:
This method only works if the container is still running; the directory no longer exists once the container has stopped or been removed.

None of the existing answers address the case of a container that exited (and can't be restarted) and/or doesn't have any shell installed (e.g. distroless ones). This one works as long as you have root access to the Docker host.
For a real manual inspection, find out the layer IDs first:
docker inspect my-container | jq '.[0].GraphDriver.Data'
In the output, you should see something like
"MergedDir": "/var/lib/docker/overlay2/03e8df748fab9526594cfdd0b6cf9f4b5160197e98fe580df0d36f19830308d9/merged"
Navigate into this folder (as root) to find the current visible state of the container filesystem.

This will launch a bash session for the image:
docker run --rm -it --entrypoint=/bin/bash <image>

On newer versions of Docker you can run docker exec [container_name] [command], which runs a command inside your running container
So to get a list of all the files in a container just run docker exec [container_name] ls

I wanted to do this, but I was unable to exec into my container as it had stopped and wasn't starting up again due to some error in my code.
What worked for me was to simply copy the contents of the entire container into a new folder like this:
docker cp container_name:/app/ new_dummy_folder
I was then able to explore the contents of this folder as one would do with a normal folder.

For me, this one works well (thanks to the last comments for pointing out the directory /var/lib/docker/):
chroot /var/lib/docker/containers/2465790aa2c4*/root/
Here, 2465790aa2c4 is the short ID of the running container (as displayed by docker ps), followed by a star.

For docker aufs driver:
The script will find the container root dir (tested on docker 1.7.1 and 1.10.3)
if [ -z "$1" ] ; then
    echo 'docker-find-root $container_id_or_name '
    exit 1
fi
CID=$(docker inspect --format '{{.Id}}' "$1")
if [ -n "$CID" ] ; then
    if [ -f "/var/lib/docker/image/aufs/layerdb/mounts/$CID/mount-id" ] ; then
        F1=$(cat "/var/lib/docker/image/aufs/layerdb/mounts/$CID/mount-id")
        d1=/var/lib/docker/aufs/mnt/$F1
    fi
    if [ ! -d "$d1" ] ; then
        d1=/var/lib/docker/aufs/diff/$CID
    fi
    echo "$d1"
fi

This answer will help those (like myself) who want to explore the docker volume filesystem even if the container isn't running.
List running docker containers:
docker ps
=> CONTAINER ID "4c721f1985bd"
Look at the docker volume mount points on your local physical machine (https://docs.docker.com/engine/tutorials/dockervolumes/):
docker inspect -f {{.Mounts}} 4c721f1985bd
=> [{ /tmp/container-garren /tmp true rprivate}]
This tells me that the local physical machine directory /tmp/container-garren is mapped to the /tmp docker volume destination.
Knowing the local physical machine directory (/tmp/container-garren) means I can explore the filesystem whether or not the docker container is running. This was critical to helping me figure out that there was some residual data that shouldn't have persisted even after the container was not running.

If you are using Docker v19.03, you can follow the steps below.
# find ID of your running container:
docker ps
# create image (snapshot) from container filesystem
docker commit 12345678904b5 mysnapshot
# explore this filesystem
docker run -t -i mysnapshot /bin/sh

For an already running container, you can do:
dockerId=$(docker inspect -f {{.Id}} [docker_id_or_name])
cd /var/lib/docker/btrfs/subvolumes/$dockerId
You need to be root in order to cd into that dir. If you are not root, try 'sudo su' before running the command.
Edit: Following v1.3, see Jiri's answer - it is better.

another trick is to use the atomic tool to do something like:
mkdir -p /path/to/mnt && atomic mount IMAGE /path/to/mnt
The Docker image will be mounted to /path/to/mnt for you to inspect it.

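My preferred way to understand what is going on inside a container is:
Publish port 8000 with -p
docker run -it -p 8000:8000 image
Start a server inside it and browse the files on port 8000 (the module below is the Python 2 one; on Python 3 use python3 -m http.server instead)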
python -m SimpleHTTPServer

If you are using the AUFS storage driver, you can use my docker-layer script to find any container's filesystem root (mnt) and readwrite layer :
# docker-layer musing_wiles
rw layer : /var/lib/docker/aufs/diff/c83338693ff190945b2374dea210974b7213bc0916163cc30e16f6ccf1e4b03f
mnt : /var/lib/docker/aufs/mnt/c83338693ff190945b2374dea210974b7213bc0916163cc30e16f6ccf1e4b03f
Edit 2018-03-28 :
docker-layer has been replaced by docker-backup

The docker exec command to run a command in a running container can help in multiple cases.
Usage: docker exec [OPTIONS] CONTAINER COMMAND [ARG...]
Run a command in a running container
Options:
  -d, --detach               Detached mode: run command in the background
      --detach-keys string   Override the key sequence for detaching a container
  -e, --env list             Set environment variables
  -i, --interactive          Keep STDIN open even if not attached
      --privileged           Give extended privileges to the command
  -t, --tty                  Allocate a pseudo-TTY
  -u, --user string          Username or UID (format: <name|uid>[:<group|gid>])
  -w, --workdir string       Working directory inside the container
For example :
1) Accessing in bash to the running container filesystem :
docker exec -it containerId bash
2) Accessing in bash to the running container filesystem as root to be able to have required rights :
docker exec -it -u root containerId bash
This is particularly useful to be able to do some processing as root in a container.
3) Accessing in bash to the running container filesystem with a specific working directory :
docker exec -it -w /var/lib containerId bash

Often times I only need to explore the docker filesystem because my build won't run, so docker run -it <container_name> bash is impractical. I also do not want to waste time and memory copying filesystems, so docker cp <container_name>:<path> <target_path> is impractical too.
While possibly unorthodox, I recommend re-building with ls as the final command in the Dockerfile:
CMD [ "ls", "-R" ]

I've found the easiest, all-in-one solution to View, Edit, Copy files with a GUI app inside almost any running container.
mc editing files in docker
inside the container install mc and ssh: docker exec -it <container> /bin/bash, then with prompt install mc and ssh packages
in same exec-bash console, run mc
press ESC then 9 then ENTER to open menu and select "Shell link..."
using "Shell link..." open SCP-based filesystem access to any host with an ssh server running (including the one running docker) by its IP address
do your job in graphical UI
this method overcomes all issues with permissions, snap isolation etc., allows to copy directly to any machine and is the most pleasant to use for me

I had an unknown container, that was doing some production workload and did not want to run any command.
So, I used docker diff.
This lists all files that the container has changed and is therefore well suited to exploring the container file system.
To get only a folder you can just use grep:
docker diff <container> | grep /var/log
It will not show files from the docker image. Depending on your use case this can help or not.
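Each line of docker diff output starts with a change type (A for added, C for changed, D for deleted) followed by the path, so you can filter on both. A small sketch, where added_under is a made-up helper name and the sample lines are invented for the demo; it assumes paths without spaces.

```shell
#!/bin/sh
# Read "docker diff"-style lines on stdin and print only the paths that
# were added (type A) under the given directory prefix.
added_under() {
    awk -v p="$1" '$1 == "A" && index($2, p) == 1 { print $2 }'
}

# Invented sample standing in for: docker diff <container>
added_under /var/log <<'EOF'
C /var
C /var/log
A /var/log/app.log
D /tmp/old.txt
A /etc/extra.conf
EOF
```

Against a real container this would be docker diff <container> | added_under /var/log.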

Late to the party, but in 2022 we have VS Code
