Docker flag "--gpus" does not work without sudo command - linux

I'm an Ubuntu user and I use the following Docker image: tensorflow/tensorflow:nightly-gpu
If I try to run this command:
$ docker run -it --rm --gpus all tensorflow/tensorflow:nightly-gpu bash
I get a permission denied error:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: mount error: open failed: /sys/fs/cgroup/devices/user.slice/devices.allow: permission denied: unknown.
Of course, the command works with sudo, but I want to use the GPU without sudo.
Is there a good solution? Any leads, please?

Since your problem seems to occur only when running with "--gpus", add or update these two sections of /etc/nvidia-container-runtime/config.toml:
[nvidia-container-cli]
no-cgroups = true
[nvidia-container-runtime]
debug = "/tmp/nvidia-container-runtime.log"
Source: https://github.com/containers/podman/issues/3659#issuecomment-543912380
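The config change above can also be scripted. The following is a minimal sketch, assuming GNU sed and a file laid out like the snippet above. The function name set_no_cgroups is hypothetical, and the function only touches the no-cgroups key, not the debug setting. If the file has neither the key nor the [nvidia-container-cli] header, it changes nothing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: force "no-cgroups = true" in an
# nvidia-container-runtime style config file.
# Defaults to the path named in the answer; pass another path to work on a copy.
set_no_cgroups() {
    local config="${1:-/etc/nvidia-container-runtime/config.toml}"
    if grep -Eq '^[[:space:]]*#?[[:space:]]*no-cgroups[[:space:]]*=' "$config"; then
        # The key exists (possibly commented out): rewrite that line in place.
        sed -i -E 's/^[[:space:]]*#?[[:space:]]*no-cgroups[[:space:]]*=.*/no-cgroups = true/' "$config"
    else
        # The key is absent: append it right after the section header.
        sed -i '/^\[nvidia-container-cli\]/a no-cgroups = true' "$config"
    fi
}
```

Because the real file is owned by root, one option is to run the function on a copy, review the result, and then copy it back with sudo.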
If you can't use docker without sudo at all:
If you are running in a Linux environment, you need to add your user to the docker group so you won't need to use sudo every time. The steps are:
$ sudo groupadd docker
$ sudo usermod -aG docker $USER
$ newgrp docker
Source: https://docs.docker.com/engine/install/linux-postinstall/
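Group changes made with usermod only apply to new login sessions, which is why the steps above end with newgrp. The sketch below is one way to check whether the membership is already active in the current shell. The function name group_status and its three output words are hypothetical, chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether the current shell already has a group.
#   active  - the running shell has the group
#   pending - the user database lists it, but this session predates the change
#   missing - the user is not in the group at all
group_status() {
    local group="$1"
    if id -nG 2>/dev/null | tr ' ' '\n' | grep -qx "$group"; then
        echo active
    elif id -nG "$(id -un 2>/dev/null)" 2>/dev/null | tr ' ' '\n' | grep -qx "$group"; then
        echo pending
    else
        echo missing
    fi
}

# Example: group_status docker
```

A result of pending suggests logging out and back in, or running newgrp docker as in the steps above.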

Related

how to mount a disk partition in docker

I have the below sd card partition from sudo blkid
/dev/sdb1: PARTLABEL="uboot" PARTUUID="5e6c4af7-015f-46df-9426-d27fb38f1d87"
...
...
...
/dev/sdb8: UUID="5f38be2e-3d5d-4c42-8d66-8aa6edc3eede" BLOCK_SIZE="1024" TYPE="ext2" PARTLABEL="userdata" PARTUUID="dceeb110-7c3e-4973-b6ba-c60f8734c988"
/dev/sdb9: UUID="51e83a43-830f-48de-bcea-309a784ea35c" BLOCK_SIZE="4096" TYPE="ext4" PARTLABEL="rootfs" PARTUUID="c58164a5-704a-4017-aeea-739a0941472f"
I am trying to mount /dev/sdb9 into a docker container so that I can reformat it and do other things with it.
But I am not able to attach it as a volume in docker container.
This is what I've done:
docker volume create --driver=local --opt type=ext4 --opt device=/dev/disk/by-uuid/51e83a43-830f-48de-bcea-309a784ea35c my-vol
docker run <image id> -v my-vol:/my-vol -it bash
However, it came up with the error: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "-v": executable file not found in $PATH: unknown.
Any ideas how I can mount /dev/sdb9 into a docker container?
You need to change the order of your docker run command so that the options come before the image. Everything after the image name is treated as the command and its arguments, so options such as -v must come before the image name. From the docker run docs (https://docs.docker.com/engine/reference/commandline/container_run/):
docker container run [OPTIONS] IMAGE [COMMAND] [ARG...]
$ docker run -it ubuntu -v $(pwd):/local
docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "-v": executable file not found in $PATH: unknown.
$ docker run -it -v $(pwd):/local ubuntu
root@8fa69b8861d8:/#
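Applied to the command from the question, the corrected ordering could look like the sketch below. The wrapper run_with_volume and the DOCKER variable are hypothetical. DOCKER exists only so the command line can be previewed with echo before running it for real. The volume name my-vol comes from the question.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: options first, then the image, then the command.
# Set DOCKER=echo to print the command line instead of running it.
run_with_volume() {
    local image="$1"
    shift
    # -it and -v are options, so they go before the image name;
    # everything in "$@" is the command to run inside the container.
    ${DOCKER:-docker} run -it -v my-vol:/my-vol "$image" "$@"
}

# Preview:  DOCKER=echo run_with_volume ubuntu bash
# Real run: run_with_volume ubuntu bash
```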

OCI runtime exec failed: exec failed: container_linux.go:380: starting container process caused: exec: <PATH>:

I've created an Astronomer Airflow directory, home\acoppers\astronomer. I ran docker db init and docker astro start to get my containers running. I want to authenticate my scheduler container to gcloud, so I tried the command:
docker container exec -it 6903e8589b00 /home/acoppers/google-cloud-sdk/bin/gcloud auth application-default login --no-launch-browser
I used that path because I installed google-cloud-sdk in my home directory. However, I am getting the following error when I run this command:
OCI runtime exec failed: exec failed: container_linux.go:380: starting container process caused: exec: "/home/acoppers/google-cloud-sdk/bin/gcloud": stat /home/acoppers/google-cloud-sdk/bin/gcloud: no such file or directory: unknown
Can someone tell me what I am doing wrong? Thank you.
Someone might find this useful. I was unable to exec into the docker container as above. I got:
OCI runtime exec failed: exec failed: container_linux.go:380:
starting container process caused: setup user: no such file or directory: unknown
It turned out that, in my case, a NodeJS child process had caused /dev/null to disappear. As soon as I restored it with
mknod /dev/null c 1 3
chmod 666 /dev/null
I was able to log in again (tested with two shells: one inside the container, one outside).
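The restore step can be wrapped in a guard so it only runs when the device node is actually missing or has been replaced. This is a minimal sketch. The function name ensure_dev_null is hypothetical, and the rm -f step is an assumption that is not in the answer above. It covers the case where a regular file has taken the place of the device node, in which mknod alone would fail because the path already exists.

```shell
#!/usr/bin/env bash
# Hypothetical helper: recreate /dev/null only if it is not a character device.
# Needs root inside the container when a repair is actually required.
ensure_dev_null() {
    local target="${1:-/dev/null}"
    if [ ! -c "$target" ]; then
        rm -f "$target"            # assumption: drop whatever replaced the node
        mknod "$target" c 1 3      # major 1, minor 3 is the null device
        chmod 666 "$target"
    fi
}

# Example (inside the affected container, as root): ensure_dev_null
```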

Docker cannot connect to daemon set after running docker restart

I have a bash script that runs periodically on an EC2 instance. The issue is that whenever the script runs
sudo service docker start
it succeeds, but when I then run docker ps it gives me this error:
ERROR: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock
The script already runs the commands that allow docker to be used without sudo, but whenever I restart the daemon it seems to reset all of that configuration.
echo "Installing Docker..."
sudo yum install -y docker
#sudo groupadd docker
#sudo usermod -aG docker ${USER}
tries=3
interval=15s
while [ $tries -gt 0 ]
do
#sudo yum reinstall -y docker
#sudo service docker start
sudo groupadd docker
sudo chmod 777 /var/run/docker.sock
sudo usermod -aG docker ${USER}
sudo service docker start
sudo chkconfig docker on
docker --version
sudo service docker restart
#sudo service docker start
docker info && break
let "tries--" && sleep $interval
done
docker info || exit
It gives me this error
Stopping docker: [ OK ]
Starting docker: . [ OK ]
Client:
Context: default
Debug Mode: false
Server:
ERROR: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/info": dial unix /var/run/docker.sock: connect: permission denied
errors pretty printing info
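Is there any way to run docker without sudo even after running the restart command? Keep in mind that I cannot remove the restart command.
You can try this instead. Run it after the restart, because the daemon typically recreates the socket when it starts, which discards an earlier chmod: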
sudo chmod 666 /var/run/docker.sock
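Since the socket may not exist for a moment after a restart, the permission change can be delayed until the socket is back. The sketch below reuses the tries and interval idea from the script in the question. The function name wait_for_path is hypothetical, and the assumption is that the socket lives at /var/run/docker.sock as in the error message.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll until a path exists, or give up.
# Arguments: path, number of tries (default 3), seconds between tries (default 15).
wait_for_path() {
    local path="$1" tries="${2:-3}" interval="${3:-15}"
    while [ "$tries" -gt 0 ]; do
        [ -e "$path" ] && return 0
        tries=$((tries - 1))
        sleep "$interval"
    done
    return 1
}

# Usage in the script, after the restart rather than before it:
#   sudo service docker restart
#   wait_for_path /var/run/docker.sock && sudo chmod 666 /var/run/docker.sock
#   docker info
```

The order matters here: the script in the question changes the socket permissions before starting and restarting the daemon, so the change is probably lost by the time docker info runs.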

How to stop a container using ssh connection?

I have this error when I try to stop a container with sudo docker stop pg:
Error response from daemon: cannot stop container: pg: Cannot kill container 9cead43f288336d418e91105d5c9a4e0858794c96ebd167e5e92784d8ed1eab2: unknown error after kill: docker-runc did not terminate sucessfully: container_linux.go:393: signaling init process caused "permission denied"
When we run docker-compose up -d everything works fine, but when we run docker-compose down or docker-compose restart we get a permission denied error.
It seems that AppArmor is blocking the access. You can try stopping it and then stopping the container again:
sudo systemctl stop apparmor && sudo systemctl disable apparmor
sudo docker stop pg
And then run it again with the flag:
--security-opt apparmor=unconfined

How to run a docker login command as a different user?

How to run a docker login command as a different user?
sudo -u gitlab-runner docker login xxx
error:
Warning: failed to get default registry endpoint from daemon (Got permission denied while trying to connect to the Docker daemon socket at <unix:///var/run/docker.sock>: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.30/info: dial unix /var/run/docker.sock: connect: permission denied). Using system default: https://index.docker.io/v1/
I even tried this:
su - gitlab-runner -c docker login xxx
Add the user to the docker group:
sudo usermod -a -G docker gitlab-runner
and then execute this command:
sudo -u gitlab-runner docker login xxx
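A side note on the su attempt in the question: -c takes a single command string, so an unquoted command is split and only its first word is run as the command. The sketch below shows the same behaviour with plain sh, on the assumption that the shell started by su treats -c the same way. With su, the quoted form would be su - gitlab-runner -c 'docker login xxx'.

```shell
#!/usr/bin/env bash
# Unquoted: the command string is just "echo"; "hello" becomes $0 and is not printed.
unquoted=$(sh -c echo hello)

# Quoted: the whole command line is passed as one string.
quoted=$(sh -c 'echo hello')

printf 'unquoted: [%s]\n' "$unquoted"   # prints: unquoted: []
printf 'quoted:   [%s]\n' "$quoted"     # prints: quoted:   [hello]
```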
