Problem
Docker containers started via the Jenkins pipeline command
docker.image(imageToStart).inside('--init')
cannot be stopped due to zombie processes left behind by the container.
Questions
How is it possible to get zombie processes from a Docker container when it was started with the '--init' option?
Has anyone else hit the same issue?
Used environment
Docker 18.03.1-ce
Jenkins 2.60.2
Docker Pipeline plugin 1.12
Details
When a container is started from a Jenkins pipeline with a command like:
docker.image('alpine').inside('--init') {
sh ('ps -efa -o pid,ppid,user,comm')
}
There are several processes in this container with parent PID 0:
[Pipeline] withDockerContainer
loco does not seem to be running inside a container
$ docker run -t -d -u 1001:1002 \
--init \
-w /lhome/ciadmin/jenkins/workspace/bli-groovy-test \
-v /lhome/ciadmin/jenkins/workspace/bli-groovy-test:/lhome/ciadmin/jenkins/workspace/bli-groovy-test:rw,z \
-v /lhome/ciadmin/jenkins/workspace/bli-groovy-test-tmp:/lhome/ciadmin/jenkins/workspace/bli-groovy-test-tmp:rw,z \
-e ******** \
--entrypoint cat alpine
[Pipeline] {
[Pipeline] sh
[bli-groovy-test] Running shell script
+ ps -efa -o pid,ppid,user,comm
PID PPID USER COMMAND
1 0 1001 init
7 1 1001 cat
8 0 1001 sh
14 8 1001 script.sh
15 14 1001 ps
[Pipeline] }
PID 1 / PPID 0 is the 'init' command used to start the container
PID 8 / PPID 0 is the 'sh' command from the closure that executes the 'ps' command
The 'sh' process does not reap its child processes. When the process itself exits, its descendants are not reparented to PID 1 (the 'init' process of the container) but to a PPID from outside the container: the new parent is the 'docker-containerd-shim' process of the container.
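For illustration, a minimal standalone sketch (the container name 'zt' is hypothetical) that reproduces the PPID 0 pattern without Jenkins:
docker run -d -t --init --name zt --entrypoint cat alpine
docker exec zt ps -o pid,ppid,comm   # the exec'ed 'ps' reports PPID 0, like 'sh' above
docker rm -f zt
This matches the PPID 0 processes shown in the Jenkins output above.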
With the small pipeline example above I could not reproduce the zombie processes, but here is the situation from a more complex Jenkins job:
Docker command from Jenkins job
$ docker run -t -d -u 1001:1002 \
--init \
-w /lhome/testadmin/jenkins-coreloops/workspace/test-job/database \
-v /lhome/testadmin/jenkins-coreloops/workspace/test-job/database:/lhome/testadmin/jenkins-coreloops/workspace/test-job/database:rw,z \
-v /lhome/testadmin/jenkins-coreloops/workspace/test-job/database-tmp:/lhome/testadmin/jenkins-coreloops/workspace/test-job/database-tmp:rw,z \
-e ******** \
--entrypoint cat richmond.lhs-systems.com:5000/ait/mpde
[Pipeline] {
[Pipeline] sh
10:03:09 [database] Running shell script
10:03:09 + ./db-upgrade.sh
- inside the container, shell scripts are started
- the shell scripts call Perl scripts
- the Perl scripts start SQL*Plus (one instance per database login)
- the Perl scripts send SQL commands to the SQL*Plus instances via STDIN
When the closure ends and Jenkins tries to stop the container, the following processes are left:
[testadmin@testhost] ~ # ps -efa | grep -vw grep | grep -w 47077
root 1725 47077 0 10:03 ? 00:00:00 [ps] <defunct>
root 1732 47077 0 10:03 ? 00:00:00 [docker-runc] <defunct>
root 2887 47077 0 10:04 ? 00:00:00 [sqlplus] <defunct>
root 2915 47077 0 10:04 ? 00:00:00 [sqlplus] <defunct>
root 47077 17349 0 10:03 ? 00:00:00 docker-containerd-shim
-namespace moby
-workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/1863503ca54f75168db8ce20c78b821c0e5280f07d59875e8f651db4f0b67d9f
-address /var/run/docker/containerd/docker-containerd.sock
-containerd-binary /usr/bin/docker-containerd
-runtime-root /var/run/docker/runtime-runc
root 47098 47077 0 10:03 pts/0 00:00:00 /dev/init -- cat
root 47506 47077 0 10:03 ? 00:00:00 [sh] <defunct>
[testadmin@testhost] ~ #
and the 'docker stop' command is aborted after its timeout of 180 seconds.
To clean up the remaining processes of the container, this docker-containerd-shim process has to be killed with SIGKILL.
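For example, using the shim PID from the listing above:
sudo kill -9 47077   # SIGKILL the docker-containerd-shim; the defunct children are then reaped by the host's init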
Note
We observed this issue on our recently installed CentOS server:
- CentOS Linux release 7.5.1804 (Core)
- environment related parts from 'docker info':
Server Version: 18.03.1-ce
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 773c489c9c1b21a6d78b5c538cd395416ec50f88
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: 949e6fa
Security Options:
seccomp
Profile: default
Kernel Version: 3.10.0-862.6.3.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 48
Total Memory: 377.6GiB
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
The behavior on our other Docker hosts is similar with respect to the multiple processes with parent PID 0, but there we did not observe containers hanging on shutdown or a similar number of zombie processes.
For comparison, the corresponding 'docker info' extract from one of these other hosts:
Server Version: 17.05.0-ce
Storage Driver: devicemapper
Pool Name: dock-thinpool
Pool Blocksize: 524.3kB
Base Device Size: 16.11GB
Backing Filesystem: xfs
Data file:
Metadata file:
Udev Sync Supported: true
Deferred Removal Enabled: true
Deferred Deletion Enabled: true
Deferred Deleted Device Count: 0
Library Version: 1.02.140-RHEL7 (2017-05-03)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9048e5e50717ea4497b757314bad98ea3763c145
runc version: 9c2d8d184e5da67c95d601382adf14862e4f2228
init version: 949e6fa
Security Options:
seccomp
Profile: default
Kernel Version: 3.10.0-693.11.6.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 48
Total Memory: 377.6GiB
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Edit 2018-Aug-01:
As a workaround, I added the init process as a subreaper to the problematic 'docker exec' calls and to the problematic 'sh' calls in the Jenkins docker().inside() closure as well.
This eliminated the zombie processes in our environment.
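For reference, a hedged sketch of such a wrapped call: since the container runs with '--init', docker-init is available as /dev/init inside the container, and assuming it is tini-based it accepts -s to register as a subreaper when it is not PID 1 (the script name is taken from the job above):
# inside the Jenkins 'sh' step, instead of calling the script directly:
/dev/init -s -- ./db-upgrade.sh
The wrapping init process then becomes the ancestor of the whole script tree and reaps any orphans it leaves behind.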
I'm new to Docker, and I tried to find a solution on Google before asking this question, with no result.
I decided to learn Docker via a practical use case: creating a PostgreSQL container on my VM instance for a development environment.
I was on vacation and didn't check my server for several days. When I later tried to connect to my DB, I couldn't: all of my active containers had exited with code 128.
I tried to start the DB container again with docker start django-postgres and got this error message: Error response from daemon: OCI runtime create failed: container with id exists: 5c11e724bf52dd1cb6fd10ebda40710385e412981eb269c30071ecc8aac9e805: unknown
Error: failed to start containers: django-postgres
I suspect that somewhere in my system Docker keeps some metadata of my container which wasn't removed after the container went down with code 128, but my knowledge of Unix isn't enough to determine where that could be. Also, I'm afraid of losing the DB data connected with the container.
Some technical info:
docker version:
Version: 18.03.0-ce
API version: 1.37
Go version: go1.9.4
Git commit: 0520e24
Built: Wed Mar 21 23:10:01 2018
OS/Arch: linux/amd64
Experimental: false
Orchestrator: swarm
docker info
Containers: 9
Running: 2
Paused: 0
Stopped: 7
Images: 5
Server Version: 18.03.0-ce
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: cfd04396dc68220d1cecbe686a6cc3aa5ce3667c
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: 949e6fa
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 4.4.0-116-generic
Operating System: Ubuntu 16.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 488.3MiB
ID: NDUH:OH24:4M4L:TR5O:TOIH:ARV4:LNRP:6QNE:WEYW:TMXR:7KNK:ZPDD
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: No swap limit support
Can anyone help me understand this issue and how to fix it without losing data?
N.B. The second container that exited with code 128 was OpenVPN. I can't restart it either, but the error was different: cgroups: cannot found cgroup mount destination: unknown
I found a solution for that here (GitHub):
The temp fix is:
sudo mkdir /sys/fs/cgroup/systemd
sudo mount -t cgroup -o none,name=systemd cgroup /sys/fs/cgroup/systemd
This fix didn't help with the Postgres container.
It is possible to list all running and stopped containers using docker ps -a (-a or --all: show all containers; the default shows just running ones).
You can find the volumes attached to your old postgres container using docker inspect <container-id> (maybe pipe to less and search for volumes).
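For example, to print just the mounts (a hedged sketch; the exact output format can differ between Docker versions):
docker inspect -f '{{ json .Mounts }}' <container-id>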
If you want to recover your data, you can attach the volume to a new postgres container and recover it there. (If it is a root volume, change target to /.)
docker run --name new-postgres \
--mount source=myoldvol,target=/var/lib/postgresql/data -d postgres
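Before removing anything, you can check that the data is visible in the new container (a hedged check; the postgres user and the \l meta-command assume the official postgres image defaults):
docker exec -it new-postgres psql -U postgres -c '\l'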
And then you can remove the old one by using docker rm <container-id>.
For more information please see,
docker ps,
docker volumes,
docker rm
I have the same issue as in 'space issue on docker devmapper and CentOS7'.
That post only describes how to clean up, not how I can increase the space, and I don't have any images to clean. I tried several things with dm.min_free_space, but nothing worked, and I want to increase the space.
OS Version/build: Red Hat Enterprise Linux Server release 7.3 (Maipo)
App version:
Client:
Version: 1.12.6
API version: 1.24
Package version: docker-common-1.12.6-11.el7.centos.x86_64
Go version: go1.7.4
Git commit: 96d83a5/1.12.6
Built: Tue Mar 7 09:23:34 2017
OS/Arch: linux/amd64
Server:
Version: 1.12.6
API version: 1.24
Package version: docker-common-1.12.6-11.el7.centos.x86_64
Go version: go1.7.4
Git commit: 96d83a5/1.12.6
Built: Tue Mar 7 09:23:34 2017
OS/Arch: linux/amd64
Steps to reproduce
I have no containers running currently and have some docker images pertaining to Kubernetes which will be used by the Kubernetes service.
sudo docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
[kubeuser4#kubenode4 Employee]$ sudo docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
docker.io/busybox latest 00f017a8c2a6 5 days ago 1.11 MB
registry.access.redhat.com/rhel7/pod-infrastructure latest 34d3450d733b 6 weeks ago 205 MB
docker.io/java 8 d23bdf5b1b1b 8 weeks ago 643.1 MB
gcr.io/google_containers/heapster_grafana v2.6.0-2 b43443930626 12 months ago 230 MB
When I try to create a Docker image of my application, I get the error below.
devmapper: Thin Pool has 8783 free data blocks which is less than minimum required 163840 free data blocks. Create more free space in thin pool or use dm.min_free_space option to change behavior
I tried the cleanup as mentioned in the other forums, but it didn't help much and I got the same error. When I tried to run with sudo docker --storage-opt dm.min_free_space=0%, it seemed to start as a daemon, but it still failed with another error, "docker-runc not installed on system", and I don't want to run it by hand as a daemon anyway.
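(A hedged aside: such a storage option can also be set persistently, so the daemon does not have to be started by hand; the /etc/sysconfig/docker-storage location follows the docker-common RPM convention, so verify it for your setup:)
# /etc/sysconfig/docker-storage
DOCKER_STORAGE_OPTIONS="--storage-opt dm.min_free_space=0%"
# then restart the service
sudo systemctl restart docker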
Below are some command outputs
sudo dmsetup status
localvg00-lv_home: 0 20971520 linear
localvg00-lv_home: 20971520 20971520 linear
docker-251:5-134039-pool: 0 209715200 thin-pool 924 848/524288 1629226/1638400 - rw discard_passdown queue_if_no_space
localvg00-lv_tmp: 0 4194304 linear
localvg00-lv_swap: 0 8388608 linear
localvg00-lv_root: 0 2097152 linear
localvg00-lv_root: 2097152 20971520 linear
localvg00-lv_usr: 0 16777216 linear
localvg00-lv_var: 0 8388608 linear
localvg00-lv_var: 8388608 62914560 linear
sudo docker info
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 4
Server Version: 1.12.6
Storage Driver: devicemapper
Pool Name: docker-251:5-134039-pool
Pool Blocksize: 65.54 kB
Base Device Size: 10.74 GB
Backing Filesystem: xfs
Data file: /dev/loop0
Metadata file: /dev/loop1
Data Space Used: 106.8 GB
Data Space Total: 107.4 GB
Data Space Available: 601.2 MB
Metadata Space Used: 3.473 MB
Metadata Space Total: 2.147 GB
Metadata Space Available: 2.144 GB
Thin Pool Minimum Free Space: 10.74 GB
Udev Sync Supported: true
Deferred Removal Enabled: false
Deferred Deletion Enabled: false
Deferred Deleted Device Count: 0
Data loop file: /var/lib/docker/devicemapper/devicemapper/data
WARNING: Usage of loopback devices is strongly discouraged for production use. Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
Library Version: 1.02.135-RHEL7 (2016-11-16)
Logging Driver: journald
Cgroup Driver: systemd
Plugins:
Volume: local
Network: overlay null bridge host
Swarm: inactive
Runtimes: runc docker-runc
Default Runtime: docker-runc
Security Options: seccomp
Kernel Version: 4.1.12-61.1.28.el7uek.x86_64
Operating System: Oracle Linux Server 7.3
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 2
CPUs: 2
Total Memory: 7.545 GiB
Name: kubenode4
I had also tried increasing the physical volume size and the logical volume size (lv_var) on my Linux machine, but it still doesn't work.
sudo lvs
[sudo] password for kubeuser4:
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
lv_home localvg00 -wi-ao---- 20.00g
lv_root localvg00 -wi-ao---- 11.00g
lv_swap localvg00 -wi-ao---- 4.00g
lv_tmp localvg00 -wi-ao---- 2.00g
lv_usr localvg00 -wi-ao---- 8.00g
lv_var localvg00 -wi-ao---- 34.00g
sudo ls -lsh /var/lib/docker/devicemapper/devicemapper/data
2.3G -rw------- 1 root root 100G Mar 14 22:16 /var/lib/docker/devicemapper/devicemapper/data
Could someone please let me know how this can be done?
Thanks.
It is better to move away from devicemapper, for a few reasons:
devicemapper in loopback mode has an unrecoverable storage issue: https://github.com/docker/docker/issues/3182 ("devicemapper not recommended for production use").
I found it easy enough to switch to the overlay storage driver; YMMV of course, but hopefully not too much. 'rm -rf /var/lib/docker' is somewhat optional when switching, but it is easy and I would highly recommend it, as long as you can load your images back in. http://www.projectatomic.io/blog/2015/06/notes-on-fedora-centos-and-docker-storage-drivers/
systemctl stop docker
rm -rf /var/lib/docker
# if these files do not already exist, create them; otherwise you need to edit by hand
# (you can also just add -s overlay in the systemctl docker script)
ls /etc/sysconfig/docker /etc/sysconfig/docker-storage
[[ $? != 0 ]] && {
echo OPTIONS='--selinux-enabled=false' > /etc/sysconfig/docker
echo "DOCKER_STORAGE_OPTIONS= -s overlay" > /etc/sysconfig/docker-storage
}
systemctl start docker
systemctl status docker
docker images
more reading:
https://docs.docker.com/engine/userguide/storagedriver/selectadriver/
https://integratedcode.us/2016/08/30/storage-drivers-in-docker-a-deep-dive/
I was able to get it working and have described it in:
https://forums.docker.com/t/devmapper-space-issue/29786/3
I'm trying to run Docker but it still fails. Here is what I get:
root@c1170137:~# docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
c04b14da8d14: Extracting 974 B/974 B
docker: failed to register layer: ApplyLayer exit status 1 stdout: stderr: permission denied.
See 'docker run --help'.
kernel: 4.4.16-1-pve
I'm using Debian Jessie:
Distributor ID: Debian
Description: Debian GNU/Linux 8.5 (jessie)
Release: 8.5
Codename: jessie
Edit:
daemon.log
http://hastebin.com/qinufacuto.coffee
docker info
root@c1177124:~# docker info
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 0
Server Version: 1.12.1
Storage Driver: vfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: host bridge null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options:
Kernel Version: 4.4.16-1-pve
Operating System: Debian GNU/Linux 8 (jessie)
OSType: linux
Architecture: x86_64
CPUs: 32
Total Memory: 2 GiB
Name: c1177124
ID: 4YUJ:OL2E:WLJC:23WJ:5HRW:LRY3:QHKC:MKXO:JDWO:VWOQ:JMWN:V52W
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
Insecure Registries:
127.0.0.0/8
By the way, the problem could be caused by the kernel.
Thank you for any ideas or solutions.
Use lxc.apparmor.profile: unconfined.
Just put it at the end of the /etc/pve/lxc/ID.conf file and restart your LXC container.
Using lxc.aa_profile: unconfined is deprecated, as it was renamed.
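A minimal sketch, assuming a container with ID 100 (the ID is hypothetical):
# append to /etc/pve/lxc/100.conf
lxc.apparmor.profile: unconfined
# then restart the container, e.g. with Proxmox's pct tool
pct stop 100 && pct start 100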
If you don't care about security or trust your docker containers:
Edit the configuration file of your lxc container on the host in /etc/pve/lxc/ID.conf by adding lxc.aa_profile: unconfined at the end of the file.
Remove apparmor: apt-get remove apparmor --purge
I solved this problem by executing these commands on the host:
lxc config set your-lxc-name security.nesting true
lxc config set your-lxc-name security.privileged true
I had the same error. In my case it was due to McAfee antivirus. I removed it and then pulled successfully. McAfee was blocking the /etc/passwd file, so Docker could not pull images.
Here people had the same exact problem:
https://github.com/moby/moby/issues/37817
I've been trying to use Docker (1.10) on Ubuntu 16.04, but installation fails because the Docker service doesn't start.
I've already tried installing Docker via the docker.io and docker-engine apt packages and via curl -sSL https://get.docker.com/ | sh, but it doesn't work.
My Host info is:
Linux Xenial 4.5.3-040503-generic #201605041831 SMP Wed May 4 22:33:50 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Here is systemctl status docker.service:
● docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since sáb 2016-05-14 15:17:31 CEST; 12min ago
Docs: https://docs.docker.com
Process: 22479 ExecStart=/usr/bin/docker daemon -H fd:// (code=exited, status=1/FAILURE)
Main PID: 22479 (code=exited, status=1/FAILURE)
may 14 15:17:30 Xenial docker[22479]: time="2016-05-14T15:17:30.103601523+02:00" level=info msg="New containerd process, pid: 22485\n"
may 14 15:17:31 Xenial docker[22479]: time="2016-05-14T15:17:31.149064723+02:00" level=error msg="devmapper: Unable to delete device: devicemapper: Can't set task name /dev/mapper/docker-8:6-2101297-pool"
may 14 15:17:31 Xenial docker[22479]: time="2016-05-14T15:17:31.149127439+02:00" level=warning msg="devmapper: Usage of loopback devices is strongly discouraged for production use. Please use `--storage-opt dm.thinpooldev` or use `man docker` to refer to dm.thinpooldev section."
may 14 15:17:31 Xenial docker[22479]: time="2016-05-14T15:17:31.153010028+02:00" level=error msg="[graphdriver] prior storage driver \"devicemapper\" failed: devicemapper: Can't set task name /dev/mapper/docker-8:6-2101297-pool"
may 14 15:17:31 Xenial docker[22479]: time="2016-05-14T15:17:31.153130839+02:00" level=fatal msg="Error starting daemon: error initializing graphdriver: devicemapper: Can't set task name /dev/mapper/docker-8:6-2101297-pool"
may 14 15:17:31 Xenial systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
may 14 15:17:31 Xenial docker[22479]: time="2016-05-14T15:17:31+02:00" level=info msg="stopping containerd after receiving terminated"
may 14 15:17:31 Xenial systemd[1]: Failed to start Docker Application Container Engine.
may 14 15:17:31 Xenial systemd[1]: docker.service: Unit entered failed state.
may 14 15:17:31 Xenial systemd[1]: docker.service: Failed with result 'exit-code'.
Here is sudo docker daemon -D
DEBU[0000] docker group found. gid: 999
DEBU[0000] Listener created for HTTP on unix (/var/run/docker.sock)
INFO[0000] previous instance of containerd still alive (23050)
DEBU[0000] containerd connection state change: CONNECTING
DEBU[0000] Using default logging driver json-file
DEBU[0000] Golang's threads limit set to 55980
DEBU[0000] received past containerd event: &types.Event{Type:"live", Id:"", Status:0x0, Pid:"", Timestamp:0x57372cae}
DEBU[0000] containerd connection state change: READY
DEBU[0000] devicemapper: driver version is 4.34.0
DEBU[0000] devmapper: Generated prefix: docker-8:6-2101297
DEBU[0000] devmapper: Checking for existence of the pool docker-8:6-2101297-pool
DEBU[0000] devmapper: poolDataMajMin=7:0 poolMetaMajMin=7:1
DEBU[0000] devmapper: Major:Minor for device: /dev/loop0 is:7:0
DEBU[0000] devmapper: Major:Minor for device: /dev/loop1 is:7:1
DEBU[0000] devmapper: loadDeviceFilesOnStart()
DEBU[0000] devmapper: Skipping file /var/lib/docker/devicemapper/metadata/transaction-metadata
DEBU[0000] devmapper: loadDeviceFilesOnStart() END
DEBU[0000] devmapper: constructDeviceIDMap()
DEBU[0000] devmapper: constructDeviceIDMap() END
DEBU[0000] devmapper: Rolling back open transaction: TransactionID=1 hash= device_id=1
ERRO[0000] devmapper: Unable to delete device: devicemapper: Can't set task name /dev/mapper/docker-8:6-2101297-pool
WARN[0000] devmapper: Usage of loopback devices is strongly discouraged for production use. Please use `--storage-opt dm.thinpooldev` or use `man docker` to refer to dm.thinpooldev section.
DEBU[0000] devmapper: Initializing base device-mapper thin volume
DEBU[0000] devicemapper: CreateDevice(poolName=/dev/mapper/docker-8:6-2101297-pool, deviceID=1)
DEBU[0000] devmapper: Error creating device: devicemapper: Can't set task name /dev/mapper/docker-8:6-2101297-pool
DEBU[0000] devmapper: Error device setupBaseImage: devicemapper: Can't set task name /dev/mapper/docker-8:6-2101297-pool
ERRO[0000] [graphdriver] prior storage driver "devicemapper" failed: devicemapper: Can't set task name /dev/mapper/docker-8:6-2101297-pool
DEBU[0000] Cleaning up old mountid : start.
FATA[0000] Error starting daemon: error initializing graphdriver: devicemapper: Can't set task name /dev/mapper/docker-8:6-2101297-pool
Here is ./check-config.sh output:
warning: /proc/config.gz does not exist, searching other paths for kernel config ...
info: reading kernel config from /boot/config-4.5.3-040503-generic ...
Generally Necessary:
- cgroup hierarchy: properly mounted [/sys/fs/cgroup]
- apparmor: enabled and tools installed
- CONFIG_NAMESPACES: enabled
- CONFIG_NET_NS: enabled
- CONFIG_PID_NS: enabled
- CONFIG_IPC_NS: enabled
- CONFIG_UTS_NS: enabled
- CONFIG_DEVPTS_MULTIPLE_INSTANCES: enabled
- CONFIG_CGROUPS: enabled
- CONFIG_CGROUP_CPUACCT: enabled
- CONFIG_CGROUP_DEVICE: enabled
- CONFIG_CGROUP_FREEZER: enabled
- CONFIG_CGROUP_SCHED: enabled
- CONFIG_CPUSETS: enabled
- CONFIG_MEMCG: enabled
- CONFIG_KEYS: enabled
- CONFIG_MACVLAN: enabled (as module)
- CONFIG_VETH: enabled (as module)
- CONFIG_BRIDGE: enabled (as module)
- CONFIG_BRIDGE_NETFILTER: enabled (as module)
- CONFIG_NF_NAT_IPV4: enabled (as module)
- CONFIG_IP_NF_FILTER: enabled (as module)
- CONFIG_IP_NF_TARGET_MASQUERADE: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_ADDRTYPE: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_CONNTRACK: enabled (as module)
- CONFIG_NF_NAT: enabled (as module)
- CONFIG_NF_NAT_NEEDED: enabled
- CONFIG_POSIX_MQUEUE: enabled
Optional Features:
- CONFIG_USER_NS: enabled
- CONFIG_SECCOMP: enabled
- CONFIG_CGROUP_PIDS: enabled
- CONFIG_MEMCG_KMEM: missing
- CONFIG_MEMCG_SWAP: enabled
- CONFIG_MEMCG_SWAP_ENABLED: missing
(note that cgroup swap accounting is not enabled in your kernel config, you can enable it by setting boot option "swapaccount=1")
- CONFIG_BLK_CGROUP: enabled
- CONFIG_BLK_DEV_THROTTLING: enabled
- CONFIG_IOSCHED_CFQ: enabled
- CONFIG_CFQ_GROUP_IOSCHED: enabled
- CONFIG_CGROUP_PERF: enabled
- CONFIG_CGROUP_HUGETLB: enabled
- CONFIG_NET_CLS_CGROUP: enabled (as module)
- CONFIG_CGROUP_NET_PRIO: enabled
- CONFIG_CFS_BANDWIDTH: enabled
- CONFIG_FAIR_GROUP_SCHED: enabled
- CONFIG_RT_GROUP_SCHED: missing
- CONFIG_EXT3_FS: missing
- CONFIG_EXT3_FS_XATTR: missing
- CONFIG_EXT3_FS_POSIX_ACL: missing
- CONFIG_EXT3_FS_SECURITY: missing
(enable these ext3 configs if you are using ext3 as backing filesystem)
- CONFIG_EXT4_FS: enabled
- CONFIG_EXT4_FS_POSIX_ACL: enabled
- CONFIG_EXT4_FS_SECURITY: enabled
- Network Drivers:
- "overlay":
- CONFIG_VXLAN: enabled (as module)
- Storage Drivers:
- "aufs":
- CONFIG_AUFS_FS: missing
- "btrfs":
- CONFIG_BTRFS_FS: enabled (as module)
- "devicemapper":
- CONFIG_BLK_DEV_DM: enabled
- CONFIG_DM_THIN_PROVISIONING: enabled (as module)
- "overlay":
- CONFIG_OVERLAY_FS: enabled (as module)
- "zfs":
- /dev/zfs: missing
- zfs command: missing
- zpool command: missing
If someone could please help me, I would be very thankful.
Update
It seems that in newer versions of Docker and Ubuntu, the unit file for Docker is simply masked (pointing to /dev/null).
You can verify it by running the following commands in the terminal:
sudo file /lib/systemd/system/docker.service
sudo file /lib/systemd/system/docker.socket
You should see that the unit file symlinks to /dev/null.
In this case, all you have to do is follow S34N's suggestion and run:
sudo systemctl unmask docker.service
sudo systemctl unmask docker.socket
sudo systemctl start docker.service
sudo systemctl status docker
I'll also keep the original post, which addresses the error log stating that the storage driver should be replaced:
Original Post
I had the same problem, and I tried fixing it with Salva Cort's suggestion, but printing /etc/default/docker says:
# THIS FILE DOES NOT APPLY TO SYSTEMD
So here's a permanent fix that works for systemd (Ubuntu 15.04 and higher):
create a new file /etc/systemd/system/docker.service.d/overlay.conf with the following content:
[Service]
ExecStart=
ExecStart=/usr/bin/docker daemon -H fd:// -s overlay
flush changes by executing:
sudo systemctl daemon-reload
verify that the configuration has been loaded:
systemctl show --property=ExecStart docker
restart docker:
sudo systemctl restart docker
The following unmasking commands worked for me (Ubuntu 18). Hope it helps someone out there... :-)
sudo systemctl unmask docker.service
sudo systemctl unmask docker.socket
sudo systemctl start docker.service
I had the same problem after upgrading Docker from 17.05-ce to 17.06-ce via docker-machine.
Update /etc/systemd/system/docker.service.d/10-machine.conf:
replace
`docker daemon` => `dockerd`
for example, from
[Service]
ExecStart=
ExecStart=/usr/bin/docker daemon -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --storage-driver aufs --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/server.pem --tlskey /etc/docker/server-key.pem --label provider=generic
Environment=
to
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --storage-driver aufs --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/server.pem --tlskey /etc/docker/server-key.pem --label provider=generic
Environment=
flush changes by executing:
sudo systemctl daemon-reload
restart docker:
sudo systemctl restart docker
Well, I finally fixed it.
All you have to do is load a different storage driver; in my case I will use overlay:
Stop the Docker service: sudo systemctl stop docker.service
Start Docker Daemon (overlay driver): sudo docker daemon -s overlay
Run Demo container: sudo docker run hello-world
In order to make these changes permanent, you must edit the /etc/default/docker file and add the option:
DOCKER_OPTS="-s overlay"
The next time the Docker service is loaded, it will run docker daemon -s overlay.
I was able to get it working after a kernel upgrade by following the directions in this blog:
https://mymemorysucks.wordpress.com/2016/03/31/docker-graphdriver-and-aufs-failed-driver-not-supported-error-after-ubuntu-upgrade/
sudo apt-get update
sudo apt-get install linux-image-extra-$(uname -r) linux-image-extra-virtual
sudo modprobe aufs
sudo service docker restart
After viewing some of the other answers, it looks like the issue was that the service wasn't running with the -s overlay option.
I also happened to notice that Docker tried to start up with ${DOCKER_OPTS} at the end of the call.
I was able to export DOCKER_OPTS="-s overlay" (because by default DOCKER_OPTS was empty) and get Docker running.
I had a similar issue on a new Docker installation (version 19.03.3-rc1) on Ubuntu 18.04.3 LTS. By default, the /etc/docker/daemon.json file does not exist on a new installation. Following a tutorial, I changed the storage driver to devicemapper by creating a new daemon.json file. It worked, but then I deleted the daemon.json file, thinking that it would revert to the default; that did not work, and the service would not start.
Creating the /etc/docker/daemon.json file again with the default storage driver fixed it for me.
{
"storage-driver": "overlay2"
}
sudo dockerd --debug will help reveal the actual pain point. I fixed the same error using this on Ubuntu 20 LTS.
As for me, I got this error:
Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.
Finally I found it was an error in /etc/docker/daemon.json, where I had added registry-mirrors:
{
"runtimes": {
"nvidia": {
"path": "/usr/bin/nvidia-container-runtime",
"runtimeArgs": []
}
}
# I forgot to add a comma here!
"registry-mirrors": ["https://docker.mirrors.ustc.edu.cn"]
}
After I added the comma and ran systemctl restart docker, it was solved.
In my case, I was getting the following error from the journalctl -xe command:
unable to configure the Docker daemon with file /etc/docker/daemon.json: invalid character 'â' looking for beginning of object key string
Just clean up /etc/docker/daemon.json with:
{
}
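A quick way to catch syntax errors like these two before restarting the daemon (assuming Python is available on the host):
python3 -m json.tool /etc/docker/daemon.json && echo "daemon.json is valid JSON"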
I had this issue today after an upgrade to the Ubuntu kernel and tried numerous solutions above. However, the only one that worked for me (Ubuntu 16.04.6 LTS) was to remove (or rename) the folder /var/lib/docker.
Please be aware that this will remove all your Docker images, containers, volumes, etc. So understand the implications before applying it, or take a backup!
There are more details here:
https://github.com/docker/for-linux/issues/162