Running nginx as non-root in Docker container gives permission denied error - linux

I have the following Dockerfile
FROM ubuntu:14.04
EXPOSE 8000
# Install nginx
RUN apt-get update -q \
&& apt-get install --no-install-recommends --no-install-suggests -y -q \
nginx \
&& rm -rf /var/lib/apt/lists/*
COPY ./nginx.conf /etc/nginx/
COPY ./index.html /usr/share/nginx/test/
RUN groupadd -r webgroup \
&& useradd -r -m -g webgroup webuser \
&& touch /run/nginx.pid \
&& chown -R webuser:webgroup /var/log/nginx /var/lib/nginx /run/nginx.pid
USER webuser
CMD nginx
When I run it I get Permission denied on /var/log/nginx:
mikhails-mbp:test-docker-nginx mkuleshov$ docker run -p 8000:8000 mytest
nginx: [alert] could not open error log file: open() "/var/log/nginx/error.log" failed (13: Permission denied)
2016/10/02 17:02:51 [emerg] 5#0: open() "/var/log/nginx/access.log" failed (13: Permission denied)
If I get into the container with bash I see:
webuser@d190146a0e8d:/var/log/nginx$ ls -la
total 8
drwxr-x--- 2 webuser webgroup 4096 Jun 2 15:16 .
drwxrwxr-x 8 root syslog 4096 Oct 2 17:02 ..
How is this possible? During the session above I also could not create files in that directory as that user.
The only thing that helped was removing /var/log/nginx and recreating it, but I have no idea why this happens.
There is no SELinux.
Has anyone encountered anything like that or is there anything I'm doing wrong?
P.S. Here is docker info if it can help
mikhails-mbp:test-docker-nginx mkuleshov$ docker info
Containers: 179
Running: 0
Paused: 0
Stopped: 179
Images: 901
Server Version: 1.11.2
Storage Driver: aufs
Root Dir: /mnt/sda1/var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 1109
Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge null host
Kernel Version: 4.4.12-boot2docker
Operating System: Boot2Docker 1.11.2 (TCL 7.1); HEAD : a6645c3 - Wed Jun 1 22:59:51 UTC 2016
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 1.955 GiB
Name: default
ID: 3K5S:3QBN:BXGY:FASS:VG6P:D4CS:UXRK:GYXB:HJQG:SIQH:F6KQ:N4BN
Docker Root Dir: /mnt/sda1/var/lib/docker
Debug mode (client): false
Debug mode (server): true
File Descriptors: 15
Goroutines: 32
System Time: 2016-10-02T17:08:51.355144074Z
EventsListeners: 0
Username: mkuleshov
Registry: https://index.docker.io/v1/
Labels:
provider=virtualbox
P.P.S. Here is a test repo with configs for that case: https://github.com/aides/test-docker-nginx

Most likely, adding your user to the adm group will solve the issue.
Try sudo usermod -aG adm webuser (in the Dockerfile, as a RUN step without sudo, placed before USER webuser).
More details: https://askubuntu.com/questions/421684/cant-access-apache-error-logs
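Before changing group membership, it can help to look at the mode and ownership of every directory on the way to the log file, since a user needs search (x) permission on each parent. The following is a minimal sketch and not part of the original answer: show_path_perms is a hypothetical helper name, and it assumes a stat that supports -c (GNU coreutils or BusyBox).

```shell
# show_path_perms PATH
# Print "MODE OWNER:GROUP NAME" for PATH and for each parent directory,
# walking upwards until the root (or "." for a relative path) is reached.
show_path_perms() {
    p=$1
    while :; do
        stat -c '%A %U:%G %n' "$p" || return 1
        case $p in
            /|.) break ;;
        esac
        p=$(dirname "$p")
    done
}
```

Inside the container, show_path_perms /var/log/nginx together with id webuser shows whether webuser owns, or shares a group with, each directory listed. It only reports classic mode bits; storage-driver quirks and security modules are outside its scope.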

Related

Incorrect permissions for file with docker compose volume? 13: Permission denied

I have the following docker_compose.yaml:
version: "3.8"
services:
  reverse-proxy:
    image: nginx:1.17.10
    container_name: reverse_proxy
    volumes:
      - ../nginx/nginx.conf:/etc/nginx/nginx.conf
    ports:
      - "8050:8050"
      - "8051:8051"
  webapp:
    image: my-site
    command: --port 8050 8051 --debug yes
    volumes:
      - /home/user/data:/data
    ports:
      - "8050:8050"
      - "8051:8051"
    depends_on:
      - reverse-proxy
When I run via docker compose I get the following error:
$ sudo docker-compose -f /home/user/docker_compose.yaml up
...
reverse_proxy | 2022/03/09 00:49:19 [emerg] 1#1: open() "/etc/nginx/nginx.conf" failed (13: Permission denied)
reverse_proxy | nginx: [emerg] open() "/etc/nginx/nginx.conf" failed (13: Permission denied)
reverse_proxy exited with code 1
So to investigate I re-ran just the nginx container:
$ sudo docker run -v ../nginx/nginx.conf:/etc/nginx/nginx.conf -t docker.io/nginx tail -f /dev/null
I opened a shell in the container and saw:
root@d8e84f89fcad:/# ls -la /etc/nginx/
ls: cannot access '/etc/nginx/nginx.conf': Permission denied
total 20
drwxr-xr-x. 3 root root 132 Mar 1 14:00 .
drwxr-xr-x. 1 root root 66 Mar 9 00:54 ..
drwxr-xr-x. 2 root root 26 Mar 1 14:00 conf.d
-rw-r--r--. 1 root root 1007 Jan 25 15:03 fastcgi_params
-rw-r--r--. 1 root root 5349 Jan 25 15:03 mime.types
lrwxrwxrwx. 1 root root 22 Jan 25 15:13 modules -> /usr/lib/nginx/modules
-?????????? ? ? ? ? ? nginx.conf
-rw-r--r--. 1 root root 636 Jan 25 15:03 scgi_params
-rw-r--r--. 1 root root 664 Jan 25 15:03 uwsgi_params
I consulted a related question and others; they suggest simply restarting the Docker service. I did that, but I still get the ? permissions when I re-run the container.
I assume this is what causes the permission error? If so, how can I set the correct permissions on this nginx config file? Is this really a volume permission issue?
Versions:
Docker version 1.13.1, build 7d71120/1.13.1
docker-compose version 1.29.2, build 5becea4c
CentOS 7
I think it was an SELinux issue; appending :z to the volume fixed it.
volumes:
  - ../nginx/nginx.conf:/etc/nginx/nginx.conf:z
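For context, the z option asks Docker to relabel the host path with a shared SELinux label so that containers may access it, while Z requests a private label instead. As a rough sketch, the option can be appended to an existing volume spec as follows; add_selinux_label is a hypothetical helper for illustration, not a Docker feature.

```shell
# add_selinux_label SPEC
# SPEC is a bind-mount spec of the form SRC:DST or SRC:DST:OPTS.
# Print SPEC with the shared-label option "z" added, unless the
# options already contain z or Z.
add_selinux_label() {
    spec=$1
    case $spec in
        *:*:*)
            # Options are present: inspect the last colon-separated field.
            opts=${spec##*:}
            case ",$opts," in
                *,z,*|*,Z,*) printf '%s\n' "$spec" ;;
                *) printf '%s,z\n' "$spec" ;;
            esac
            ;;
        *)
            # No options yet: add a new options field.
            printf '%s:z\n' "$spec"
            ;;
    esac
}
```

For example, add_selinux_label ../nginx/nginx.conf:/etc/nginx/nginx.conf prints the spec used in the fix above, and a spec ending in :ro becomes :ro,z.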

chown not working when copying a file in a Dockerfile

I'm running Docker Engine on Windows and am trying to add my own file to the image. The problem is that when I copy the file, its ownership is always root:root, but it needs to be heartbeat:heartbeat (an existing user on the image). Mounting a single file with the -v parameter of docker run doesn't seem to be possible on Windows at the moment. That's why I tried to create my own image with a Dockerfile:
FROM docker.elastic.co/beats/heartbeat:7.16.3
USER root
COPY --chown=heartbeat:heartbeat yml/heartbeat.yml /usr/share/heartbeat/heartbeat.yml
RUN chown -R heartbeat:heartbeat /usr/share/heartbeat
The --chown parameter on the COPY instruction does nothing: the file is still owned by root when I check, and the RUN chown command results in an error. Here is the output:
docker image build ./ -t custom/heartbeat:7.16.3
Sending build context to Docker daemon 10.75kB
Step 1/4 : FROM docker.elastic.co/beats/heartbeat:7.16.3
---> b64ad4b42006
Step 2/4 : USER root
---> Using cache
---> 922a9121e51b
Step 3/4 : COPY --chown=heartbeat:heartbeat yml/heartbeat.yml /usr/share/heartbeat/heartbeat.yml
---> Using cache
---> f30eb4934dca
Step 4/4 : RUN chown -R heartbeat:heartbeat /usr/share/heartbeat
---> [Warning] The requested image's platform (linux/amd64) does not match the detected host platform (windows/amd64) and no specific platform was requested
---> Running in 2ae3bfdd5422
The command '/bin/sh -c chown -R heartbeat:heartbeat /usr/share/heartbeat' returned a non-zero code: 4294967295: failed to shutdown container: container 2ae3bfdd5422e81461a14896db0908e4cd67af1a6f99c629abff1e588f62fc32 encountered an error during hcsshim::System::waitBackground: failure in a Windows system call: The virtual machine or container with the specified identifier is not running. (0xc0370110): subsequent terminate failed container 2ae3bfdd5422e81461a14896db0908e4cd67af1a6f99c629abff1e588f62fc32 encountered an error during hcsshim::System::waitBackground: failure in a Windows system call: The virtual machine or container with the specified identifier is not running. (0xc0370110)
All help is welcome...
Running with --platform:
PS C:\SynteticMonitoring> docker image build ./ -t custom/heartbeat:7.16.3
Sending build context to Docker daemon 9.728kB
Step 1/4 : FROM --platform=linux/amd64 docker.elastic.co/beats/heartbeat:7.16.3
---> b64ad4b42006
Step 2/4 : USER root
---> Using cache
---> 922a9121e51b
Step 3/4 : COPY --chown=heartbeat:heartbeat yml/heartbeat.yml /usr/share/heartbeat/heartbeat.yml
---> Using cache
---> f30eb4934dca
Step 4/4 : RUN chmod +r /usr/share/heartbeat/heartbeat.yml
---> Using cache
---> e9a075d2ab53
Successfully built e9a075d2ab53
Successfully tagged custom/heartbeat:7.16.3
PS C:\SynteticMonitoring> docker run --interactive --tty --entrypoint /bin/sh custom/heartbeat:7.16.3
sh-4.2# ls -l
total 106916
-rw-r--r-- 1 root root 13675 Jan 7 00:47 LICENSE.txt
-rw-r--r-- 1 root root 1964303 Jan 7 00:47 NOTICE.txt
-rw-r--r-- 1 root root 851 Jan 7 00:47 README.md
drwxrwxr-x 2 root root 4096 Jan 7 00:48 data
-rw-r--r-- 1 root root 374197 Jan 7 00:47 fields.yml
-rwxr-xr-x 1 root root 107027952 Jan 7 00:47 heartbeat
-rw-r--r-- 1 root root 69196 Jan 7 00:47 heartbeat.reference.yml
-rw-rw-rw- 1 root root 1631 Jan 26 06:49 heartbeat.yml
drwxr-xr-x 2 root root 4096 Jan 7 00:47 kibana
drwxrwxr-x 2 root root 4096 Jan 7 00:48 logs
drwxr-xr-x 2 root root 4096 Jan 7 00:47 monitors.d
sh-4.2# pwd
/usr/share/heartbeat
You can't chown a file to a user that does not exist. It seems that the heartbeat user and group do not exist in your base image.
That's why COPY --chown does nothing and you get files owned by root.
You can fix this by creating the user before the COPY. To do this, add a line before your COPY statement, such as:
RUN addgroup heartbeat && adduser -S -H heartbeat -G heartbeat
If your base image doesn't have addgroup and adduser, try this alternative:
RUN useradd -rUM -s /usr/sbin/nologin heartbeat
This creates the heartbeat group and user, after which chown can change the ownership successfully.
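Before adding a user, it may be worth confirming whether the account is really missing from the image. A minimal sketch, assuming a POSIX shell and the id utility are available in the image; user_exists is a hypothetical helper name.

```shell
# user_exists NAME
# Succeed (exit status 0) if NAME resolves to a user account, fail otherwise.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Report the result for the account discussed above.
if user_exists heartbeat; then
    echo "heartbeat: present"
else
    echo "heartbeat: missing"
fi
```

Run in a shell inside the image (for example via docker run with --entrypoint /bin/sh, as in the question), this tells you whether the user needs to be created first.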
According to Dockerfile documentation:
The optional --platform flag can be used to specify the platform of the image in case FROM references a multi-platform image. For example, linux/amd64, linux/arm64, or windows/amd64. By default, the target platform of the build request is used.
I suggest trying something like:
FROM [--platform=<platform>] <image> [AS <name>]
FROM --platform=linux/amd64 docker.elastic.co/beats/heartbeat:7.16.3

docker can't stat directory on external device

Briefly
I'm looking to build docker image from a dockerfile in a directory on an external device.
Context
The directory /media/nathan/ext/test is empty except for a Dockerfile
Dockerfile is : FROM alpine
docker version is : Docker version 20.10.8, build 3967b7d28e
OS is Ubuntu 21.10
I am part of the docker group
mount options :
$> findmnt /media/nathan/ext
TARGET SOURCE FSTYPE OPTIONS
/media/nathan/ext /dev/sda1 ext4 rw,nosuid,nodev,relatime
Docker daemon
$> ps aux | grep dockerd
root 919 0.0 0.5 2166356 85600 ? Ssl 09:03 0:08 dockerd --group docker --exec-root=/run/snap.docker --data-root=/var/snap/docker/common/var-lib-docker --pidfile=/run/snap.docker/docker.pid --config-file=/var/snap/docker/1125/config/daemon.json
nathan 19756 0.0 0.0 11844 2448 pts/0 S+ 11:44 0:00 grep --color=auto dockerd
$DOCKER_HOST is undefined
$> echo $DOCKER_HOST
docker info
$> docker info
Client:
Context: default
Debug Mode: false
Server:
Containers: 1
Running: 0
Paused: 0
Stopped: 1
Images: 263
Server Version: 20.10.8
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
Default Runtime: runc
Init Binary: docker-init
containerd version: e25210fe30a0a703442421b0f60afac609f950a3
runc version:
init version: de40ad0
Security Options:
Expected result
I get a docker image
True result
$> docker build .
error checking context: 'can't stat '/media/nathan/ext/test''.
What I have tried
Just sudo everything
$> sudo docker build .
error checking context: 'can't stat '/media/nathan/ext/test''.
Issue is not resolved
Am I the owner of the context folder ?
$> echo $USER
nathan
$> ls -la
total 12
drwxrwxr-x 2 nathan nathan 4096 nov. 12 10:33 .
drwxr-xr-x 8 nathan root 4096 nov. 12 09:39 ..
-rw-rw-r-- 1 nathan nathan 12 nov. 12 10:32 Dockerfile
As per command above, I am the owner of the context directory. Am I missing something ?
add everything to .dockerignore
I've created a .dockerignore that matches everything : '*'.
Running the command [sudo] docker build . gives a very baffling answer:
$> sudo docker build .
open /media/nathan/ext/test/.dockerignore: permission denied
I do not understand how sudo lacks the permissions needed to read (?) the .dockerignore file, whose permissions I set to 777 out of astonishment:
ls -la
total 16
drwxrwxr-x 2 nathan nathan 4096 nov. 12 10:41 .
drwxr-xr-x 8 nathan root 4096 nov. 12 09:39 ..
-rw-rw-r-- 1 nathan nathan 12 nov. 12 10:32 Dockerfile
-rwxrwxrwx 1 nathan nathan 2 nov. 12 10:41 .dockerignore
Of course, other programs were able to read the file without any issue, as expected:
$> cat .dockerignore
*
Build outside of external drive
$> pwd
/home/nathan/Bureau/test
$> ls -la
total 12
drwxrwxr-x 2 nathan nathan 4096 nov. 12 10:58 .
drwxr-xr-x 3 nathan nathan 4096 nov. 12 10:56 ..
-rw-rw-r-- 1 nathan nathan 12 nov. 12 10:58 Dockerfile
$> docker build .
Sending build context to Docker daemon 2.048kB
Step 1/1 : FROM alpine
---> 14119a10abf4
Successfully built 14119a10abf4
The image is built, but I wish to replicate this result on the external drive.
running docker build . with journalctl
[...]
nov. 12 11:42:52 nathan-pc systemd[1746]: Started snap.docker.docker.ba3da9ef-34ee-4a63-8ff4-6a56327c5cd2.scope.
nov. 12 11:42:52 nathan-pc audit[19690]: AVC apparmor="DENIED" operation="open" profile="snap.docker.docker" name="/media/nathan/ext/workspace/dino/ntrip-client/RTKLIB/" pid=19690 comm="docker" requested_mask="r" denied_mask="r" fsuid=1000 ouid=1000
nov. 12 11:42:52 nathan-pc kernel: audit: type=1400 audit(1636713772.367:93): apparmor="DENIED" operation="open" profile="snap.docker.docker" name="/media/nathan/ext/workspace/dino/ntrip-client/RTKLIB/" pid=19690 comm="docker" requested_mask="r" denied_mask="r" fsuid=1000 ouid=1000
nov. 12 11:42:52 nathan-pc systemd[1746]: snap.docker.docker.ba3da9ef-34ee-4a63-8ff4-6a56327c5cd2.scope: Deactivated successfully.
[...]
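The journal lines above show the snap.docker.docker AppArmor profile denying a read, which would explain why file permissions and sudo make no difference. As a small sketch for pulling the relevant fields out of such logs, the helper below prints the profile and path of each denial; apparmor_denials is a hypothetical name, and the pattern assumes the field order seen in the lines above.

```shell
# apparmor_denials
# Read log lines on stdin and print "PROFILE PATH" for every AppArmor
# record marked DENIED. Lines that do not match are ignored.
apparmor_denials() {
    sed -n 's/.*apparmor="DENIED".* profile="\([^"]*\)" name="\([^"]*\)".*/\1 \2/p'
}
```

A possible use is journalctl -k | apparmor_denials. Whether the snap offers an interface for removable media can be checked with snap connections docker; that the denial goes away once such an interface is connected is an assumption here, not something the question confirms.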
Thank you for your time

How to change the default directory docker uses to build an image

I am trying to set up a gitlab ci.
Because, for some reason, I do not have a "gitlab-runner" user and I do not have write permission on "/home/user_1", this is how I installed the runner:
/usr/local/bin/gitlab-runner install --user=user_1 --working-directory=/data/external/tmp/gitlab-runner
And this is how I registered it:
/usr/local/bin/gitlab-runner register --url GITLAB_URL --registration-token TOKEN
I then created this gitlab-ci.yml file:
stages:
  - deploy
deploy:
  stage: deploy
  # only:
  #   - 3.0.x
  script:
    - echo "deploying"
    - sudo docker build -t my_image:v1 .
    - echo "********Docker Images********"
    - sudo docker image list
    - echo "********End of Docker Images********"
    - sudo docker run -d -p 3000:5000 --rm --name my_container my_image:v1
  tags:
    - deploy
I get this error:
Error: error creating build container: Error committing the finished image:
error adding layer with blob "sha256:bb7d5a84853b217ac05783963f12b034243070c1c9c8d2e60ada47444f3cce04":
Error processing tar file(exit status 1):
Error setting up pivot dir: mkdir
/home/user_1/.local/share/containers/storage/overlay/62a747bf1719d2d37fff5670ed40de6900a95743172de1b4434cb019b56f30b4/diff/.pivot_root436648414:
permission denied
I would like to replace /home/user_1/.local/share/containers/storage/overlay/
with another location so that I do not get the permission error.
Any advice on how to do so?
I am using Redhat Linux
docker --version is podman version 3.2.3
docker info:
server_name:/home/my_user[ 52 ] --> docker info
host:
arch: amd64
buildahVersion: 1.21.3
cgroupControllers: []
cgroupManager: cgroupfs
cgroupVersion: v1
conmon:
package: conmon-2.0.29-1.module+el8.4.0+11822+6cc1e7d7.x86_64
path: /usr/bin/conmon
version: 'conmon version 2.0.29, commit: ae467a0c8001179d4d0adf4ada381108a893d7ec'
cpus: 8
distribution:
distribution: '"rhel"'
version: "8.4"
eventLogger: file
hostname: server_name
idMappings:
gidmap:
- container_id: 0
host_id: 1000
size: 1
uidmap:
- container_id: 0
host_id: 67298
size: 1
kernel: 4.18.0-305.3.1.el8_4.x86_64
linkmode: dynamic
memFree: 1818484736
memTotal: 33444728832
ociRuntime:
name: runc
package: runc-1.0.0-74.rc95.module+el8.4.0+11822+6cc1e7d7.x86_64
path: /usr/bin/runc
version: |-
runc version spec: 1.0.2-dev
go: go1.15.13
libseccomp: 2.5.1
os: linux
remoteSocket:
path: /run/user/67298/podman/podman.sock
security:
apparmorEnabled: false
capabilities: CAP_NET_RAW,CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
rootless: true
seccompEnabled: true
seccompProfilePath: /usr/share/containers/seccomp.json
selinuxEnabled: false
serviceIsRemote: false
slirp4netns:
executable: /bin/slirp4netns
package: slirp4netns-1.1.8-1.module+el8.4.0+11822+6cc1e7d7.x86_64
version: |-
slirp4netns version 1.1.8
commit: d361001f495417b880f20329121e3aa431a8f90f
libslirp: 4.3.1
SLIRP_CONFIG_VERSION_MAX: 3
libseccomp: 2.5.1
swapFree: 67353165824
swapTotal: 67448598528
uptime: 789h 40m 40.57s (Approximately 32.88 days)
registries:
localhost:
Blocked: false
Insecure: true
Location: localhost
MirrorByDigestOnly: false
Mirrors: []
Prefix: localhost
mkdcvtmaapp01:
Blocked: false
Insecure: true
Location: server_name
MirrorByDigestOnly: false
Mirrors: []
Prefix: server_name
search:
- registry.access.redhat.com
- registry.redhat.io
- docker.io
store:
configFile: /home/my_user/.config/containers/storage.conf
containerStore:
number: 0
paused: 0
running: 0
stopped: 0
graphDriverName: overlay
graphOptions:
overlay.mount_program:
Executable: /bin/fuse-overlayfs
Package: fuse-overlayfs-1.6-1.module+el8.4.0+11822+6cc1e7d7.x86_64
Version: |-
fusermount3 version: 3.2.1
fuse-overlayfs: version 1.6
FUSE library version 3.2.1
using FUSE kernel interface version 7.26
graphRoot: /home/my_user/.local/share/containers/storage
graphStatus:
Backing Filesystem: nfs
Native Overlay Diff: "false"
Supports d_type: "true"
Using metacopy: "false"
imageStore:
number: 0
runRoot: /run/user/67298/containers
volumePath: /home/my_user/.local/share/containers/storage/volumes
version:
APIVersion: 3.2.3
Built: 1627570963
BuiltTime: Thu Jul 29 11:02:43 2021
GitCommit: ""
GoVersion: go1.15.7
OsArch: linux/amd64
Version: 3.2.3
I have also tried these three variables in my GitLab CI job, but they did not work:
deploy:
  variables:
    DOCKER_DRIVER: overlay2
    DOCKER_TMP: /data/external/tmp_docker_build
    TMPDIR: /data/external/tmp_docker_build
I also ran chmod 777 on .local, share, containers, storage, and overlay in the path /home/user_1/.local/share/containers/storage/overlay/, but it is still not working.
I did not know about this before, either. Apparently you can set the data directory used by the Docker daemon by adding -g /path/to/dir to the daemon command line (newer daemon versions call this option --data-root).
For example, by adding -g to DOCKER_OPTS in /etc/default/docker on Ubuntu or Debian systems:
DOCKER_OPTS="--dns 8.8.8.8 --dns 8.8.4.4 -g /data/external/docker"
My source is https://forums.docker.com/t/how-do-i-change-the-docker-image-installation-directory/1169 - there is also a note about how this is done on Fedora or CentOS:
edit /etc/sysconfig/docker, and add the -g option in the other_args variable: ex. other_args="-g /var/lib/testdir". If there’s more than one option, make sure you enclose them in " ". After a restart, (service docker restart) Docker should use the new directory.
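Since docker --version reports podman in the question, the daemon options above may not apply there: rootless Podman reads its storage location from the storage.conf file listed under store.configFile in the docker info output. The following is a hedged sketch rather than a confirmed fix. The helper name write_storage_conf and the target directory /data/external/containers/storage are assumptions; [storage], driver and graphroot are keys from containers-storage.conf.

```shell
# write_storage_conf FILE GRAPHROOT
# Write a minimal per-user containers storage configuration to FILE that
# points image and layer storage at GRAPHROOT.
# Caution: this overwrites FILE; back up an existing configuration first.
write_storage_conf() {
    conf=$1
    graphroot=$2
    mkdir -p "$(dirname "$conf")"
    cat > "$conf" <<EOF
[storage]
driver = "overlay"
graphroot = "$graphroot"
EOF
}
```

It could be called as write_storage_conf "$HOME/.config/containers/storage.conf" /data/external/containers/storage. Other settings from the existing configuration, such as the overlay mount program shown in the output, would need to be carried over by hand. The output also reports an NFS backing filesystem, so choosing a directory on a local filesystem is probably the point of the move.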

Zombie processes in Docker also with init / tini system

Problem
Docker containers started via Jenkins pipeline command
docker.image(imageToStart).inside('--init')
cannot be stopped due to zombie processes left by the container.
Questions
How is it possible to get zombie processes from a Docker container, when it was started with '--init' option?
Has someone else hit the same issue?
Used environment
Docker 18.03.1-ce
Jenkins 2.60.2
Docker Pipeline plugin 1.12
Details
When a container is started from Jenkins pipeline with a command like:
docker.image('alpine').inside('--init') {
sh ('ps -efa -o pid,ppid,user,comm')
}
There are several processes in this container with parent PID 0:
[Pipeline] withDockerContainer
loco does not seem to be running inside a container
$ docker run -t -d -u 1001:1002 \
--init \
-w /lhome/ciadmin/jenkins/workspace/bli-groovy-test \
-v /lhome/ciadmin/jenkins/workspace/bli-groovy-test:/lhome/ciadmin/jenkins/workspace/bli-groovy-test:rw,z \
-v /lhome/ciadmin/jenkins/workspace/bli-groovy-test-tmp:/lhome/ciadmin/jenkins/workspace/bli-groovy-test-tmp:rw,z \
-e ******** \
--entrypoint cat alpine
[Pipeline] {
[Pipeline] sh
[bli-groovy-test] Running shell script
+ ps -efa -o pid,ppid,user,comm
PID PPID USER COMMAND
1 0 1001 init
7 1 1001 cat
8 0 1001 sh
14 8 1001 script.sh
15 14 1001 ps
[Pipeline] }
PID 1 / PPID 0 is the 'init' command used to start the container
PID 8 / PPID 0 is the 'sh' command from the closure to execute 'ps' command
The 'sh' process does not reap its child processes. When the process itself exits, its descendants are reparented to a process outside the container rather than to PID 1, the container's 'init' process.
The new parent PID is the PID of the 'docker-containerd-shim' process of the container.
With the small example I could not reproduce the zombie processes, but here is the situation from a more complex Jenkins job:
Docker command from Jenkins job
$ docker run -t -d -u 1001:1002 \
--init \
-w /lhome/testadmin/jenkins-coreloops/workspace/test-job/database \
-v /lhome/testadmin/jenkins-coreloops/workspace/test-job/database:/lhome/testadmin/jenkins-coreloops/workspace/test-job/database:rw,z \
-v /lhome/testadmin/jenkins-coreloops/workspace/test-job/database-tmp:/lhome/testadmin/jenkins-coreloops/workspace/test-job/database-tmp:rw,z \
-e ******** \
--entrypoint cat richmond.lhs-systems.com:5000/ait/mpde
[Pipeline] {
[Pipeline] sh
10:03:09 [database] Running shell script
10:03:09 + ./db-upgrade.sh
inside the container shell scripts are started
shell scripts call perl scripts
perl scripts start SQL*Plus (1 instance per database login)
perl scripts send SQL commands to SQL*Plus instances via STDIN
When the closure ends and Jenkins tries to stop the container, the following processes are left:
[testadmin@testhost] ~ # ps -efa | grep -vw grep | grep -w 47077
root 1725 47077 0 10:03 ? 00:00:00 [ps] <defunct>
root 1732 47077 0 10:03 ? 00:00:00 [docker-runc] <defunct>
root 2887 47077 0 10:04 ? 00:00:00 [sqlplus] <defunct>
root 2915 47077 0 10:04 ? 00:00:00 [sqlplus] <defunct>
root 47077 17349 0 10:03 ? 00:00:00 docker-containerd-shim
-namespace moby
-workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/1863503ca54f75168db8ce20c78b821c0e5280f07d59875e8f651db4f0b67d9f
-address /var/run/docker/containerd/docker-containerd.sock
-containerd-binary /usr/bin/docker-containerd
-runtime-root /var/run/docker/runtime-runc
root 47098 47077 0 10:03 pts/0 00:00:00 /dev/init -- cat
root 47506 47077 0 10:03 ? 00:00:00 [sh] <defunct>
[testadmin@testhost] ~ #
and the 'docker stop' command aborts after a timeout of 180 seconds.
To clean up the remaining processes of the container, this docker-containerd-shim process has to be killed with SIGKILL.
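To spot this situation on a host, the defunct children of a given shim process can be filtered out of ps output. A minimal sketch, assuming the ps -ef column layout shown above (PID in column 2, PPID in column 3); defunct_children is a hypothetical helper name.

```shell
# defunct_children PARENT_PID
# Read `ps -ef` style output on stdin and print the PID of every
# <defunct> process whose parent PID equals PARENT_PID, one per line.
defunct_children() {
    awk -v parent="$1" '$3 == parent && /<defunct>/ { print $2 }'
}
```

For the listing above, ps -ef | defunct_children 47077 would print the PIDs of the [ps], [docker-runc], [sqlplus] and [sh] entries and skip the still-running /dev/init process.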
Note
We observed this issue on our recently installed CentOS server:
- CentOS Linux release 7.5.1804 (Core)
- environment related parts from 'docker info':
Server Version: 18.03.1-ce
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 773c489c9c1b21a6d78b5c538cd395416ec50f88
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: 949e6fa
Security Options:
seccomp
Profile: default
Kernel Version: 3.10.0-862.6.3.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 48
Total Memory: 377.6GiB
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
The behavior on our other Docker hosts is similar with respect to the multiple processes with parent PID 0, but there we did not observe containers hanging on shutdown or a similar number of zombie processes.
For comparison, here is the corresponding 'docker info' extract from one of these other hosts:
Server Version: 17.05.0-ce
Storage Driver: devicemapper
Pool Name: dock-thinpool
Pool Blocksize: 524.3kB
Base Device Size: 16.11GB
Backing Filesystem: xfs
Data file:
Metadata file:
Udev Sync Supported: true
Deferred Removal Enabled: true
Deferred Deletion Enabled: true
Deferred Deleted Device Count: 0
Library Version: 1.02.140-RHEL7 (2017-05-03)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9048e5e50717ea4497b757314bad98ea3763c145
runc version: 9c2d8d184e5da67c95d601382adf14862e4f2228
init version: 949e6fa
Security Options:
seccomp
Profile: default
Kernel Version: 3.10.0-693.11.6.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 48
Total Memory: 377.6GiB
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Edit 2018-Aug-01:
As a workaround, I also added the init process as a subreaper to the problematic 'docker exec' calls and the problematic 'sh' calls in the Jenkins docker().inside() closure.
This eliminated the zombie processes in our environment.
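The edit does not show the exact commands. One way this could look is sketched below, assuming the container was started with --init so that /dev/init (the init binary visible in the process list above) is present, and that it behaves like tini, whose -s option registers it as a child subreaper. The helper name exec_with_reaper and the DOCKER override variable are hypothetical.

```shell
# exec_with_reaper CONTAINER COMMAND [ARGS...]
# Run COMMAND inside CONTAINER through the container's init binary in
# subreaper mode, so that orphaned descendants of COMMAND are reaped by it.
# DOCKER may be set to substitute the docker binary (useful for dry runs).
exec_with_reaper() {
    container=$1
    shift
    "${DOCKER:-docker}" exec "$container" /dev/init -s -- "$@"
}
```

For example, exec_with_reaper "$CONTAINER_ID" ./db-upgrade.sh would wrap the script from the job above. Setting DOCKER=echo prints the command line instead of running it.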

Resources