ElasticSearch docker container remains in Exited status - linux

I recently installed Docker and ElasticSearch 7.17.6. docker-compose up -d worked fine,
but the ElasticSearch container stays in Exited (1) status and cannot be started.
Command used to start it: sudo docker container start <container-ID>
sudo docker logs <Container-ID> shows the following exception:
Exception in thread "main" java.nio.file.NoSuchFileException: /usr/share/elasticsearch/config/jvm.options
at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:218)
at java.base/java.nio.file.Files.newByteChannel(Files.java:380)
at java.base/java.nio.file.Files.newByteChannel(Files.java:432)
at java.base/java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:422)
at java.base/java.nio.file.Files.newInputStream(Files.java:160)
at org.elasticsearch.tools.launchers.JvmOptionsParser.readJvmOptionsFiles(JvmOptionsParser.java:168)
at org.elasticsearch.tools.launchers.JvmOptionsParser.jvmOptions(JvmOptionsParser.java:124)
at org.elasticsearch.tools.launchers.JvmOptionsParser.main(JvmOptionsParser.java:86)
/var/log/messages shows the following errors:
msg="error detaching from network es_elastic: could not find network attachment for container <Container-ID> to network es_elastic"
Nov 1 13:11:07 ES11 dockerd: time="2022-11-01T13:11:07.567246932-04:00" level=info msg="initialized VXLAN UDP port to 4789 "
Nov 1 13:11:07 ES11 kernel: br0: port 2(<Device_name2>) entered disabled state
Nov 1 13:11:07 ES11 kernel: br0: port 1(<Device_name1>) entered disabled state
Nov 1 13:11:07 ES11 kernel: ov-1-f: renamed from br0
Nov 1 13:11:07 ES11 kernel: device <Device_name2> left promiscuous mode
Nov 1 13:11:07 ES11 kernel: ov-1-f: port 2(<Device_name2>) entered disabled state
Nov 1 13:11:07 ES11 kernel: device <Device_name1> left promiscuous mode
Nov 1 13:11:07 ES11 kernel: ov-1-f: port 1(<Device_name1>) entered disabled state
Nov 1 13:11:07 ES11 kernel: vx-1-f: renamed from <Device_name1>
Nov 1 13:11:07 ES11 kernel: : renamed from <Device_name2>
Nov 1 13:11:07 ES11 avahi-daemon[891]: Withdrawing workstation service for vx-1-f.
Nov 1 13:11:07 ES11 NetworkManager[999]: <info> [ID.7289] manager: (): new Veth device (/org/freedesktop/NetworkManager/Devices/144)
Nov 1 13:11:07 ES11 kernel: : renamed from eth0
Nov 1 13:11:07 ES11 NetworkManager[999]: <info> [ID.7693] manager: (): new Veth device (/org/freedesktop/NetworkManager/Devices/145)
Nov 1 13:11:07 ES11 dockerd: time="2022-11-01T13:11:07.ID-04:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object

The jvm.options file was missing from the config directory at /u01/es11/config, but I am unsure why the error shows a different location and why it worked on the other nodes, es12 and es13. The containers started fine after I placed the missing files there.
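
One plausible reading, assuming the compose file bind-mounts the host directory /u01/es11/config onto /usr/share/elasticsearch/config: the exception reports the path as seen inside the container, so a file missing on the host shows up under the container path. The sketch below checks a host config directory for the files Elasticsearch normally reads at startup. The function name and the file list are illustrative assumptions, not taken from the compose file.

```python
from pathlib import Path

# Files an Elasticsearch 7.x config directory normally contains.
# This list is an assumption for illustration; adjust it to your setup.
EXPECTED_FILES = ("elasticsearch.yml", "jvm.options", "log4j2.properties")


def missing_config_files(config_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from config_dir."""
    root = Path(config_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Host path from the question; change it to match your own node.
    missing = missing_config_files("/u01/es11/config")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected config files are present.")
```

Running the same check on es12 and es13 would show whether those nodes already had the files, which would account for the different behaviour.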

Related

Could not generate persistent MAC address for veth476ff90: No such file or directory

I'm seeing this error message in my Docker Swarm manager:
Feb 6 10:07:08 swarm-manager systemd-udevd[3557149]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Feb 6 10:07:08 swarm-manager systemd-udevd[3557149]: Could not generate persistent MAC address for veth7bdf732: No such file or directory
Feb 6 10:07:08 swarm-manager NetworkManager[951]: <info> [1675674428.1388] manager: (veth7bdf732): new Veth device (/org/freedesktop/NetworkManager/Devices/36528)
Feb 6 10:07:08 swarm-manager systemd-udevd[3557148]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Feb 6 10:07:08 swarm-manager systemd-udevd[3557148]: Could not generate persistent MAC address for vethe6bef2b: No such file or directory
Feb 6 10:07:08 swarm-manager kernel: veth1: renamed from veth7bdf732
Feb 6 10:07:08 swarm-manager kernel: br0: port 3(veth1) entered blocking state
Feb 6 10:07:08 swarm-manager kernel: br0: port 3(veth1) entered disabled state
Feb 6 10:07:08 swarm-manager kernel: device veth1 entered promiscuous mode
Feb 6 10:07:08 swarm-manager kernel: br0: port 3(veth1) entered blocking state
Feb 6 10:07:08 swarm-manager kernel: br0: port 3(veth1) entered forwarding state
Feb 6 10:07:08 swarm-manager kernel: docker_gwbridge: port 3(veth17599e0) entered blocking state
Feb 6 10:07:08 swarm-manager kernel: docker_gwbridge: port 3(veth17599e0) entered disabled state
Feb 6 10:07:08 swarm-manager kernel: device veth17599e0 entered promiscuous mode
Feb 6 10:07:08 swarm-manager kernel: docker_gwbridge: port 3(veth17599e0) entered blocking state
Feb 6 10:07:08 swarm-manager kernel: docker_gwbridge: port 3(veth17599e0) entered forwarding state
Feb 6 10:07:08 swarm-manager NetworkManager[951]: <info> [1675674428.1755] manager: (veth476ff90): new Veth device (/org/freedesktop/NetworkManager/Devices/36529)
Feb 6 10:07:08 swarm-manager systemd-udevd[3557161]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Feb 6 10:07:08 swarm-manager NetworkManager[951]: <info> [1675674428.1765] manager: (veth17599e0): new Veth device (/org/freedesktop/NetworkManager/Devices/36530)
Feb 6 10:07:08 swarm-manager systemd-udevd[3557161]: Could not generate persistent MAC address for veth476ff90: No such file or directory
I would like to know the real impact of this message. All the services are running properly, but at some point I noticed some instability in my system that somehow produced an OOM, which may lead to a PostgreSQL database inconsistency.
I also looked at the same issue on Stack Overflow and GitHub, but none of the answers brought me closer to a solution.
My OS is Oracle Linux 8.6
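
To judge whether these warnings line up with the instability, it can help to pull them out of the log and count them per interface before comparing timestamps with the OOM events. A minimal sketch, assuming syslog-style lines like those above; the function name and the regular expression are illustrative, not part of any Docker or systemd tooling.

```python
import re
from collections import Counter

# Matches the udev warning from the log excerpt and captures the interface name.
MAC_WARNING = re.compile(r"Could not generate persistent MAC address for (\S+):")


def count_mac_warnings(lines):
    """Count 'persistent MAC address' warnings per interface name."""
    counts = Counter()
    for line in lines:
        match = MAC_WARNING.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = [
    "Feb 6 10:07:08 swarm-manager systemd-udevd[3557149]: Could not generate persistent MAC address for veth7bdf732: No such file or directory",
    "Feb 6 10:07:08 swarm-manager kernel: veth1: renamed from veth7bdf732",
    "Feb 6 10:07:08 swarm-manager systemd-udevd[3557148]: Could not generate persistent MAC address for vethe6bef2b: No such file or directory",
]
print(count_mac_warnings(sample))
```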

Docker daemon service crashes very often on worker node

The Docker service stops very often on one of my remote worker nodes.
I am not able to figure out why this is happening.
OS: Ubuntu 19.04
Log: journalctl -xe
Mar 12 10:43:44 machine1 systemd-networkd[434]: vethc827a75: Gained IPv6LL
Mar 12 10:43:44 machine1 kernel: docker_gwbridge: port 2(veth7e595dc) entered disabled state
Mar 12 10:43:45 machine1 kernel: docker_gwbridge: port 3(veth7574e8b) entered blocking state
Mar 12 10:43:45 machine1 kernel: docker_gwbridge: port 3(veth7574e8b) entered forwarding state
Mar 12 10:43:45 machine1 kernel: veth2: renamed from veth3b5a70d
Mar 12 10:43:45 machine1 kernel: br0: port 4(veth2) entered blocking state
Mar 12 10:43:45 machine1 kernel: br0: port 4(veth2) entered disabled state
Mar 12 10:43:45 machine1 kernel: device veth2 entered promiscuous mode
Mar 12 10:43:45 machine1 kernel: br0: port 4(veth2) entered blocking state
Mar 12 10:43:45 machine1 kernel: br0: port 4(veth2) entered forwarding state
Mar 12 10:43:45 machine1 kernel: br0: port 3(veth1) entered disabled state
Mar 12 10:43:45 machine1 kernel: docker_gwbridge: port 3(veth7574e8b) entered disabled state
Mar 12 10:43:45 machine1 kernel: br0: port 4(veth2) entered disabled state
Mar 12 10:43:45 machine1 kernel: docker_gwbridge: port 4(vethcb2c2a4) entered blocking state
Mar 12 10:43:45 machine1 kernel: docker_gwbridge: port 4(vethcb2c2a4) entered disabled state
Mar 12 10:43:45 machine1 kernel: device vethcb2c2a4 entered promiscuous mode
Mar 12 10:43:45 machine1 systemd-udevd[2887]: Could not generate persistent MAC address for vethc361b7b: No such file or directory
Mar 12 10:43:45 machine1 systemd-udevd[2890]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Mar 12 10:43:45 machine1 systemd-udevd[2890]: Could not generate persistent MAC address for veth7574e8b: No such file or directory
Mar 12 10:43:45 machine1 kernel: veth2: renamed from veth6691f49
Mar 12 10:43:45 machine1 kernel: br0: port 4(veth2) entered blocking state
Mar 12 10:43:45 machine1 kernel: br0: port 4(veth2) entered disabled state
Mar 12 10:43:45 machine1 kernel: device veth2 entered promiscuous mode
Mar 12 10:43:45 machine1 kernel: br0: port 4(veth2) entered blocking state
Mar 12 10:43:45 machine1 kernel: br0: port 4(veth2) entered forwarding state
Mar 12 10:43:45 machine1 systemd-udevd[2937]: link_config: could not get ethtool features for vethbf19a70
Mar 12 10:43:45 machine1 systemd-udevd[2937]: Could not set offload features of vethbf19a70: No such device
Mar 12 10:43:45 machine1 systemd-udevd[2889]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Mar 12 10:43:45 machine1 systemd-udevd[2891]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Mar 12 10:43:45 machine1 systemd-udevd[2891]: link_config: could not get ethtool features for veth3b5a70d
Mar 12 10:43:45 machine1 systemd-udevd[2891]: Could not set offload features of veth3b5a70d: No such device
Mar 12 10:43:45 machine1 systemd-udevd[2885]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Mar 12 10:43:45 machine1 systemd-udevd[2885]: Could not generate persistent MAC address for veth2100695: No such file or directory
Mar 12 10:43:45 machine1 systemd-udevd[2884]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.

docker exec: rpc error: code = 2 desc = oci runtime error: exec failed

Every time I try to run:
$ docker exec
I get the error message:
rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "process_linux.go:83: executing setns process caused \"exit status 16\""
Session 1 (works as expected):
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
alpine latest baa5d63471ea 7 weeks ago 4.8 MB
hello-world latest c54a2cc56cbb 5 months ago 1.85 kB
$ docker run --rm --name alpine -it alpine sh
/ # pwd
/
Session 2:
$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7bd39b37aee2 alpine "sh" 22 seconds ago Up 21 seconds alpine
$ docker exec -it alpine sh
rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "process_linux.go:83: executing setns process caused \"exit status 16\""
$ docker exec -it 7bd39b37aee2 sh
rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "process_linux.go:83: executing setns process caused \"exit status 16\""
/var/log/syslog shows some warnings, but I was able neither to understand the root cause nor to find matching answers.
Thanks for any hint.
= = = = = = = = = = = = = = = = = = = = = = = = =
$ docker info
Containers: 1
Running: 1
Paused: 0
Stopped: 0
Images: 2
Server Version: 1.13.0-rc3
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 4
Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 03e5862ec0d8d3b3f750e19fca3ee367e13c090e
runc version: 51371867a01c467f08af739783b8beafc154c4d7
init version: 949e6fa
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 4.4.0-53-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.487 GiB
Name: pb7tt6ts
ID: YQ4G:ETTP:5VCM:PAJD:F3KB:O7JN:AZOF:VLTI:SKH4:BTSR:KP7D:NXIZ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
= = =
/var/log/syslog docker restart and steps above
= = =
Dec 13 14:28:09 pb7tt6ts systemd[1]: Stopping Docker Socket for the API.
Dec 13 14:28:09 pb7tt6ts systemd[1]: Starting Docker Socket for the API.
Dec 13 14:28:09 pb7tt6ts systemd[1]: Listening on Docker Socket for the API.
Dec 13 14:28:09 pb7tt6ts systemd[1]: Starting Docker Application Container Engine...
Dec 13 14:28:09 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:09.291301057+01:00" level=info msg="libcontainerd: new containerd process, pid: 1448"
Dec 13 14:28:10 pb7tt6ts kernel: [25908.125394] audit: type=1400 audit(1481635690.357:28): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="docker-default" pid=1466 comm="apparmor_parser"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.370364923+01:00" level=info msg="[graphdriver] using prior storage driver: aufs"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.387915069+01:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.388367650+01:00" level=warning msg="Your kernel does not support swap memory limit."
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.388465142+01:00" level=warning msg="Your kernel does not support cgroup rt period"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.388508739+01:00" level=warning msg="Your kernel does not support cgroup rt runtime"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.389419384+01:00" level=info msg="Loading containers: start."
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.397339748+01:00" level=info msg="Firewalld running: false"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.628011070+01:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.743703578+01:00" level=info msg="Loading containers: done."
Dec 13 14:28:10 pb7tt6ts kernel: [25908.510718] aufs au_opts_verify:1597:dockerd[1462]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.808510166+01:00" level=info msg="Daemon has completed initialization"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.808575966+01:00" level=info msg="Docker daemon" commit=4d92237 graphdriver=aufs version=1.13.0-rc3
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.820562161+01:00" level=info msg="API listen on /var/run/docker.sock"
Dec 13 14:28:10 pb7tt6ts systemd[1]: Started Docker Application Container Engine.
Dec 13 14:28:10 pb7tt6ts console-kit-daemon[3106]: console-kit-daemon[3106]: GLib-CRITICAL: Source ID 226 was not found when attempting to remove it
Dec 13 14:28:10 pb7tt6ts console-kit-daemon[3106]: GLib-CRITICAL: Source ID 226 was not found when attempting to remove it
Dec 13 14:28:16 pb7tt6ts kernel: [25914.206672] aufs au_opts_verify:1597:dockerd[1460]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 14:28:16 pb7tt6ts kernel: [25914.388393] aufs au_opts_verify:1597:dockerd[1460]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 14:28:16 pb7tt6ts kernel: [25914.492197] aufs au_opts_verify:1597:dockerd[1460]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <warn> [1481635696.7320] device (vethff6f844): failed to find device 35 'vethff6f844' with udev
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <info> [1481635696.7340] manager: (vethff6f844): new Veth device (/org/freedesktop/NetworkManager/Devices/46)
Dec 13 14:28:16 pb7tt6ts systemd-udevd[1614]: Could not generate persistent MAC address for vethff6f844: No such file or directory
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <warn> [1481635696.7345] device (veth13c2a1d): failed to find device 36 'veth13c2a1d' with udev
Dec 13 14:28:16 pb7tt6ts systemd-udevd[1615]: Could not generate persistent MAC address for veth13c2a1d: No such file or directory
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <info> [1481635696.7417] manager: (veth13c2a1d): new Veth device (/org/freedesktop/NetworkManager/Devices/47)
Dec 13 14:28:16 pb7tt6ts kernel: [25914.509027] device veth13c2a1d entered promiscuous mode
Dec 13 14:28:16 pb7tt6ts kernel: [25914.509240] IPv6: ADDRCONF(NETDEV_UP): veth13c2a1d: link is not ready
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <info> [1481635696.7632] devices added (path: /sys/devices/virtual/net/vethff6f844, iface: vethff6f844)
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <info> [1481635696.7632] device added (path: /sys/devices/virtual/net/vethff6f844, iface: vethff6f844): no ifupdown configuration found.
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <info> [1481635696.7639] devices added (path: /sys/devices/virtual/net/veth13c2a1d, iface: veth13c2a1d)
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <info> [1481635696.7640] device added (path: /sys/devices/virtual/net/veth13c2a1d, iface: veth13c2a1d): no ifupdown configuration found.
Dec 13 14:28:16 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:16.965015836+01:00" level=warning msg="Your kernel does not support swap memory limit."
Dec 13 14:28:16 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:16.965090775+01:00" level=warning msg="Your kernel does not support cgroup rt period"
Dec 13 14:28:16 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:16.965117179+01:00" level=warning msg="Your kernel does not support cgroup rt runtime"
Dec 13 14:28:17 pb7tt6ts kernel: [25914.808163] eth0: renamed from vethff6f844
Dec 13 14:28:17 pb7tt6ts acvpnagent[2339]: Function: tableCallbackHandler File: RouteMgr.cpp Line: 1723 Invoked Function: recv Return Code: 11 (0x0000000B) Description: unknown
Dec 13 14:28:17 pb7tt6ts NetworkManager[1343]: <info> [1481635697.0599] devices removed (path: /sys/devices/virtual/net/vethff6f844, iface: vethff6f844)
Dec 13 14:28:17 pb7tt6ts acvpnagent[2339]: A new network interface has been detected.
Dec 13 14:28:17 pb7tt6ts NetworkManager[1343]: <info> [1481635697.0600] device (vethff6f844): driver 'veth' does not support carrier detection.
Dec 13 14:28:17 pb7tt6ts acvpnagent[2339]: Function: logInterfaces File: RouteMgr.cpp Line: 2105 Invoked Function: logInterfaces Return Code: 0 (0x00000000) Description: IP Address Interface List: 192.168.178.24 172.17.0.1 9.145.68.34 FE80:0:0:0:D8B4:C1E0:F8E4:DB77 FE80:0:0:0:42:44FF:FEC9:5D85 FE80:0:0:0:60A9:A1FF:FEED:F31C
Dec 13 14:28:17 pb7tt6ts NetworkManager[1343]: <info> [1481635697.0604] device (veth13c2a1d): link connected
Dec 13 14:28:17 pb7tt6ts NetworkManager[1343]: <info> [1481635697.0605] device (docker0): link connected
Dec 13 14:28:17 pb7tt6ts kernel: [25914.823988] IPv6: ADDRCONF(NETDEV_CHANGE): veth13c2a1d: link becomes ready
Dec 13 14:28:17 pb7tt6ts kernel: [25914.824039] docker0: port 1(veth13c2a1d) entered forwarding state
Dec 13 14:28:17 pb7tt6ts kernel: [25914.824061] docker0: port 1(veth13c2a1d) entered forwarding state
Dec 13 14:28:18 pb7tt6ts acvpnagent[2339]: Function: tableCallbackHandler File: RouteMgr.cpp Line: 1723 Invoked Function: recv Return Code: 11 (0x0000000B) Description: unknown
Dec 13 14:28:18 pb7tt6ts avahi-daemon[1217]: Joining mDNS multicast group on interface veth13c2a1d.IPv6 with address fe80::60a9:a1ff:feed:f31c.
Dec 13 14:28:18 pb7tt6ts avahi-daemon[1217]: New relevant interface veth13c2a1d.IPv6 for mDNS.
Dec 13 14:28:18 pb7tt6ts avahi-daemon[1217]: Registering new address record for fe80::60a9:a1ff:feed:f31c on veth13c2a1d.*.
Dec 13 14:28:32 pb7tt6ts kernel: [25929.850840] docker0: port 1(veth13c2a1d) entered forwarding state
Dec 13 14:28:36 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:36.704565159+01:00" level=error msg="Error running exec in container: rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused \"process_linux.go:83: executing setns process caused \\\"exit status 16\\\"\"\n"
Dec 13 14:28:36 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:36.705362948+01:00" level=error msg="Handler for POST /v1.25/exec/8a78f29ef71d4c3ab982a8dd7a4a325e280766072dea7337860874a72c42f42c/resize returned error: rpc error: code = 2 desc = containerd: process not found for container"
Dec 13 14:28:46 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:46.921880770+01:00" level=error msg="Error running exec in container: rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused \"process_linux.go:83: executing setns process caused \\\"exit status 16\\\"\"\n"
Dec 13 14:28:46 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:46.922576933+01:00" level=error msg="Handler for POST /v1.25/exec/5ad25668cac553118b8c702f02c69b427436eb67d1488d4170641bcacfdad50b/resize returned error: rpc error: code = 2 desc = containerd: process not found for container"
As recommended, I reverted to a stable release of Docker and installed docker-engine 1.12.4:
$ docker info
Containers: 2
Running: 1
Paused: 0
Stopped: 1
Images: 3
Server Version: 1.12.4
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 11
Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: host bridge null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: apparmor seccomp
Kernel Version: 4.4.0-53-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.487 GiB
Name: pb7tt6ts
ID: YQ4G:ETTP:5VCM:PAJD:F3KB:O7JN:AZOF:VLTI:SKH4:BTSR:KP7D:NXIZ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
127.0.0.0/8
Still no success, but a different error:
$ docker exec -it alpine sh
rpc error: code = 13 desc = invalid header field value "oci runtime error: exec failed: container_linux.go:247: starting container process caused \"process_linux.go:83: executing setns process caused \\\"exit status 17\\\"\"\n"
Corresponding /var/log/syslog from service docker start (21:00), docker run ... (21:01), docker exec ... (21:01)
Dec 13 21:00:01 pb7tt6ts systemd[1]: Starting Docker Socket for the API.
Dec 13 21:00:01 pb7tt6ts systemd[1]: Listening on Docker Socket for the API.
Dec 13 21:00:01 pb7tt6ts systemd[1]: Starting Docker Application Container Engine...
Dec 13 21:00:01 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:01.468921183+01:00" level=info msg="libcontainerd: new containerd process, pid: 8686"
Dec 13 21:00:02 pb7tt6ts kernel: [49419.124965] audit: type=1400 audit(1481659202.536:37): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="docker-default" pid=8700 comm="apparmor_parser"
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.550070413+01:00" level=info msg="[graphdriver] using prior storage driver \"aufs\""
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.572067603+01:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.572336166+01:00" level=warning msg="Your kernel does not support swap memory limit."
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.572799562+01:00" level=info msg="Loading containers: start."
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.579465999+01:00" level=info msg="Firewalld running: false"
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.779165187+01:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.903085523+01:00" level=info msg="Loading containers: done."
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.903179108+01:00" level=info msg="Daemon has completed initialization"
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.903208197+01:00" level=info msg="Docker daemon" commit=1564f02 graphdriver=aufs version=1.12.4
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.923282443+01:00" level=info msg="API listen on /var/run/docker.sock"
Dec 13 21:00:02 pb7tt6ts systemd[1]: Started Docker Application Container Engine.
Dec 13 21:01:01 pb7tt6ts kernel: [49477.834789] aufs au_opts_verify:1597:dockerd[8692]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 21:01:01 pb7tt6ts kernel: [49477.896566] aufs au_opts_verify:1597:dockerd[8692]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 21:01:01 pb7tt6ts kernel: [49478.080340] aufs au_opts_verify:1597:dockerd[8692]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 21:01:01 pb7tt6ts kernel: [49478.192100] aufs au_opts_verify:1597:dockerd[8682]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <warn> [1481659261.6125] device (veth2b5b07c): failed to find device 47 'veth2b5b07c' with udev
Dec 13 21:01:01 pb7tt6ts systemd-udevd[8810]: Could not generate persistent MAC address for vethc2e4873: No such file or directory
Dec 13 21:01:01 pb7tt6ts kernel: [49478.196917] device vethc2e4873 entered promiscuous mode
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.6215] manager: (veth2b5b07c): new Veth device (/org/freedesktop/NetworkManager/Devices/63)
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <warn> [1481659261.6222] device (vethc2e4873): failed to find device 48 'vethc2e4873' with udev
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.6241] manager: (vethc2e4873): new Veth device (/org/freedesktop/NetworkManager/Devices/64)
Dec 13 21:01:01 pb7tt6ts systemd-udevd[8809]: Could not generate persistent MAC address for veth2b5b07c: No such file or directory
Dec 13 21:01:01 pb7tt6ts kernel: [49478.211913] IPv6: ADDRCONF(NETDEV_UP): vethc2e4873: link is not ready
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.6454] devices added (path: /sys/devices/virtual/net/veth2b5b07c, iface: veth2b5b07c)
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.6454] device added (path: /sys/devices/virtual/net/veth2b5b07c, iface: veth2b5b07c): no ifupdown configuration found.
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.6507] devices added (path: /sys/devices/virtual/net/vethc2e4873, iface: vethc2e4873)
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.6507] device added (path: /sys/devices/virtual/net/vethc2e4873, iface: vethc2e4873): no ifupdown configuration found.
Dec 13 21:01:01 pb7tt6ts kernel: [49478.557310] eth0: renamed from veth2b5b07c
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.9915] devices removed (path: /sys/devices/virtual/net/veth2b5b07c, iface: veth2b5b07c)
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.9916] device (veth2b5b07c): driver 'veth' does not support carrier detection.
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.9919] device (vethc2e4873): link connected
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.9937] device (docker0): link connected
Dec 13 21:01:01 pb7tt6ts kernel: [49478.573434] IPv6: ADDRCONF(NETDEV_CHANGE): vethc2e4873: link becomes ready
Dec 13 21:01:01 pb7tt6ts kernel: [49478.573503] docker0: port 1(vethc2e4873) entered forwarding state
Dec 13 21:01:01 pb7tt6ts kernel: [49478.573527] docker0: port 1(vethc2e4873) entered forwarding state
Dec 13 21:01:03 pb7tt6ts avahi-daemon[1217]: Joining mDNS multicast group on interface vethc2e4873.IPv6 with address fe80::d02a:ecff:fea8:662c.
Dec 13 21:01:03 pb7tt6ts avahi-daemon[1217]: New relevant interface vethc2e4873.IPv6 for mDNS.
Dec 13 21:01:03 pb7tt6ts avahi-daemon[1217]: Registering new address record for fe80::d02a:ecff:fea8:662c on vethc2e4873.*.
Dec 13 21:01:17 pb7tt6ts kernel: [49493.628038] docker0: port 1(vethc2e4873) entered forwarding state
Dec 13 21:02:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:02:02.072027206+01:00" level=error msg="Error running exec in container: rpc error: code = 13 desc = invalid header field value \"oci runtime error: exec failed: container_linux.go:247: starting container process caused \\\"process_linux.go:83: executing setns process caused \\\\\\\"exit status 17\\\\\\\"\\\"\\n\""
Dec 13 21:02:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:02:02.072759152+01:00" level=error msg="Handler for POST /v1.24/exec/00c0dcac7a178129a17cd9eb833d154d428f2a6efbcd0f421ab3c5c54e52a236/resize returned error: rpc error: code = 2 desc = containerd: process not found for container"
This comment from the linked issue appears to identify the root cause:
I think I found the root reason. It's nothing to do with Docker.
Actually docker exec always fail because of Symantec AutoProtect
running on my system. It loads a custom kernel module that add some
file operation hooks, which affects the result of setns.
$ lsmod | grep symev
symev_custom_dkms_x86_64 72166 2 symap_custom_dkms_x86_64
The workaround is to disable Symantec AutoProtect and reboot.
sudo update-rc.d autoprotect disable
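
To check whether the modules named in that comment are loaded on a given host, one could scan lsmod output for their prefixes. A minimal sketch: the prefixes are taken from the lsmod line above, while the function name is hypothetical.

```python
def find_modules(lsmod_output, prefixes=("symev", "symap")):
    """Return names of loaded kernel modules that start with any given prefix."""
    found = []
    for line in lsmod_output.splitlines():
        fields = line.split()
        # The first column of lsmod output is the module name.
        if fields and fields[0].startswith(tuple(prefixes)):
            found.append(fields[0])
    return found


sample = "symev_custom_dkms_x86_64 72166 2 symap_custom_dkms_x86_64"
print(find_modules(sample))
```

On a live system the input could come from subprocess.run(["lsmod"], capture_output=True, text=True).stdout; an empty result would suggest looking for a different cause.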

Weave takes my node offline when I hit a container's ip

I'm running weave with kubernetes/cni. I have a wordpress/mysql pod running on a kube minion. When I hit the URL of the wordpress service via the browser, my node goes down (on Azure). I upgraded to 8 cores and 14 GB of RAM, and now when I hit the wordpress URL I find that I can't access the internet, i.e. curl google.com fails, although it worked before I hit the wordpress URL.
I was curious, so I ran tail -f /var/log/syslog and found the following, which may be relevant. Please note that I have an nginx pod on the same node and I can access its URL without incident; the following happens only when I hit the wordpress installation page:
Jul 21 16:38:27 sc-minion-1 kernel: [ 6319.723678] device vethwepl4fe8074 entered promiscuous mode
Jul 21 16:38:27 sc-minion-1 kernel: [ 6319.732271] eth0: renamed from vethwepg4fe8074
Jul 21 16:38:27 sc-minion-1 systemd-udevd[8388]: Could not generate persistent MAC address for vethwepl4fe8074: No such file or directory
Jul 21 16:38:27 sc-minion-1 kernel: [ 6319.744321] IPv6: ADDRCONF(NETDEV_UP): vethwepl4fe8074: link is not ready
Jul 21 16:38:27 sc-minion-1 kernel: [ 6319.744661] IPv6: ADDRCONF(NETDEV_CHANGE): vethwepl4fe8074: link becomes ready
Jul 21 16:38:27 sc-minion-1 kernel: [ 6319.744730] weave: port 3(vethwepl4fe8074) entered forwarding state
Jul 21 16:38:27 sc-minion-1 kernel: [ 6319.744736] weave: port 3(vethwepl4fe8074) entered forwarding state
Jul 21 16:38:27 sc-minion-1 docker[883]: time="2016-07-21T16:38:27.987193241Z" level=error msg="Handler for GET /images/nginx/json returned error: No such image: nginx"
Jul 21 16:38:27 sc-minion-1 kubelet[2088]: I0721 16:38:27.987696 2088 provider.go:91] Refreshing cache for provider: *credentialprovider.defaultDockerConfigProvider
Jul 21 16:38:28 sc-minion-1 kernel: [ 6320.564156] IPv6: eth0: IPv6 duplicate address fe80::3c2b:b1ff:fe47:fe26 detected!
Jul 21 16:38:42 sc-minion-1 kernel: [ 6334.748036] weave: port 3(vethwepl4fe8074) entered forwarding state
Jul 21 16:38:42 sc-minion-1 docker[883]: time="2016-07-21T16:38:42.988502303Z" level=warning msg="Error getting v2 registry: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"
Jul 21 16:38:42 sc-minion-1 docker[883]: time="2016-07-21T16:38:42.988545804Z" level=error msg="Attempting next endpoint for pull after error: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"
Jul 21 16:39:02 sc-minion-1 docker[883]: time="2016-07-21T16:39:02.989940573Z" level=error msg="Not continuing with pull after error: Network timed out while trying to connect to https://index.docker.io/v1/repositories/library/nginx/images. You may want to check your internet connection or if you are behind a proxy."

Failing to start systemd service

I have written the following systemd service to connect to the wireless network at boot:
[Unit]
Description=Wireless network connectivity (%i)
Wants=network.target
Before=network.target
BindsTo=sys-subsystem-net-devices-%i-device
After=sys-subsystem-net-devices-%i-device
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/ip link set dev %i up
ExecStart=/usr/bin/wpa_supplicant -B -i %i -c /etc/wpa_supplicant/wpa_supplicant.conf
ExecStart=/usr/bin/dhcpcd %i
ExecStop=/usr/bin/ip link set dev %i down
[Install]
WantedBy=multi-user.target
I then enable it, but I get the following error every time I boot my computer:
[abc@arch ~]$ systemctl --failed
UNIT LOAD ACTIVE SUB DESCRIPTION
● network-wireless@wlp3s0.service loaded failed failed Wireless network connectivity (wlp3s0)
However, if I manually start this service after boot with:
systemctl start network-wireless@wlp3s0
the service starts as expected.
This is the content of wpa_supplicant.conf
ctrl_interface=/var/run/wpa_supplicant
ctrl_interface_group=wheel
network={
ssid="TeliaGateway30-91-8F-1C-B2-29"
#psk="A80871A90A"
psk=b4d8a1e9ad665eed0178fea6f141134e795e15183a661848b371a41bb73a6844
}
Why does this service start fine when I start it manually but not at boot, and how can I change it so that it starts at boot?
EDIT: Added error output.
This is the error I am getting:
[abc@arch ~]$ journalctl -b -u network-wireless@wlp3s0.service
-- Logs begin at Sat 2015-08-22 12:50:42 CEST, end at Sun 2015-08-23 22:15:26 CEST. --
Aug 23 21:23:36 arch systemd[1]: Starting Wireless network connectivity (wlp3s0)...
Aug 23 21:23:36 arch ip[274]: Cannot find device "wlp3s0"
Aug 23 21:23:36 arch systemd[1]: network-wireless@wlp3s0.service: Main process exited, code=exited, status=1/FAILURE
Aug 23 21:23:36 arch systemd[1]: Failed to start Wireless network connectivity (wlp3s0).
Aug 23 21:23:37 arch systemd[1]: network-wireless@wlp3s0.service: Unit entered failed state.
Aug 23 21:23:37 arch systemd[1]: network-wireless@wlp3s0.service: Failed with result 'exit-code'.
Aug 23 21:25:11 arch systemd[1]: Starting Wireless network connectivity (wlp3s0)...
Aug 23 21:25:11 arch dhcpcd[424]: wlp3s0: waiting for carrier
Aug 23 21:25:16 arch dhcpcd[424]: wlp3s0: carrier acquired
Aug 23 21:25:16 arch dhcpcd[424]: DUID 00:01:00:01:1d:6b:6b:e6:10:0d:7f:b7:30:f3
Aug 23 21:25:16 arch dhcpcd[424]: wlp3s0: IAID c1:c4:73:e0
Aug 23 21:25:16 arch dhcpcd[424]: wlp3s0: soliciting an IPv6 router
Aug 23 21:25:16 arch dhcpcd[424]: wlp3s0: rebinding lease of 192.168.1.85
Aug 23 21:25:21 arch dhcpcd[424]: wlp3s0: leased 192.168.1.85 for 3600 seconds
Aug 23 21:25:21 arch dhcpcd[424]: wlp3s0: adding route to 192.168.1.0/24
Aug 23 21:25:21 arch dhcpcd[424]: wlp3s0: adding default route via 192.168.1.1
Aug 23 21:25:21 arch dhcpcd[424]: forked to background, child pid 477
Aug 23 21:25:21 arch dhcpcd[424]: wlp3s0: waiting for carrier
Aug 23 21:25:21 arch dhcpcd[424]: wlp3s0: carrier acquired
Aug 23 21:25:21 arch dhcpcd[424]: DUID 00:01:00:01:1d:6b:6b:e6:10:0d:7f:b7:30:f3
Aug 23 21:25:21 arch dhcpcd[424]: wlp3s0: IAID c1:c4:73:e0
Aug 23 21:25:21 arch dhcpcd[424]: wlp3s0: soliciting an IPv6 router
Aug 23 21:25:21 arch dhcpcd[424]: wlp3s0: rebinding lease of 192.168.1.85
Aug 23 21:25:21 arch dhcpcd[424]: wlp3s0: leased 192.168.1.85 for 3600 seconds
Aug 23 21:25:21 arch dhcpcd[424]: wlp3s0: adding route to 192.168.1.0/24
Aug 23 21:25:21 arch dhcpcd[424]: wlp3s0: adding default route via 192.168.1.1
Aug 23 21:25:21 arch dhcpcd[424]: forked to background, child pid 477
Aug 23 21:25:21 arch systemd[1]: Started Wireless network connectivity (wlp3s0).
Aug 23 21:25:28 arch dhcpcd[477]: wlp3s0: no IPv6 Routers available
Aug 23 22:15:09 arch dhcpcd[477]: wlp3s0: carrier lost
Aug 23 22:15:09 arch dhcpcd[477]: wlp3s0: deleting route to 192.168.1.0/24
Aug 23 22:15:09 arch dhcpcd[477]: wlp3s0: deleting default route via 192.168.1.1
Aug 23 22:15:13 arch dhcpcd[477]: wlp3s0: carrier acquired
Aug 23 22:15:14 arch dhcpcd[477]: wlp3s0: IAID c1:c4:73:e0
Aug 23 22:15:14 arch dhcpcd[477]: wlp3s0: soliciting an IPv6 router
Aug 23 22:15:14 arch dhcpcd[477]: wlp3s0: rebinding lease of 192.168.1.85
Aug 23 22:15:19 arch dhcpcd[477]: wlp3s0: leased 192.168.1.85 for 3600 seconds
Aug 23 22:15:19 arch dhcpcd[477]: wlp3s0: adding route to 192.168.1.0/24
Aug 23 22:15:19 arch dhcpcd[477]: wlp3s0: adding default route via 192.168.1.1
Aug 23 22:15:19 arch dhcpcd[477]: wlp3s0: removing route to 192.168.1.0/24
Aug 23 22:15:26 arch dhcpcd[477]: wlp3s0: no IPv6 Routers available
EDIT:
I have found one potential cause: the network interface seems to be renamed from wlan0 during boot. However, I have tried starting the service with wlan0 and the result is the same.
The service fails at boot because the ip command cannot find the interface. According to the systemd.service man page, a failing ExecStart= command (one not prefixed with "-") is enough to mark the unit as failed, and the remaining commands are not executed.
As you noticed yourself, the interface gets renamed or is not ready yet at boot.
You can check whether you need to add an After= statement.
You can check with the systemd-analyze command whether the ordering at boot is correct.
You could split up the service and make it more robust. Most daemons can start fine even if the interface is not ready yet.
Personally, I would make dhcpcd and wpa_supplicant separate services and use systemd-networkd or a udev rule to bring up the interface (if that is even needed). There are a lot of example unit files for wpa_supplicant and dhcpcd online; maybe have a look at those?
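As a possible stopgap for the "Cannot find device" failure described above, a small wait loop can hold back the commands until the kernel has registered the interface under its final name. This is only a minimal sketch and not part of the original unit: the function name wait_for_iface and the SYSFS_NET override are hypothetical, and it assumes interfaces appear as entries under /sys/class/net, as they do on Linux.

```shell
#!/bin/sh
# Hypothetical helper: wait until a network interface shows up in sysfs.
# SYSFS_NET is an assumption added here so the lookup directory can be overridden.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

wait_for_iface() {
    iface="$1"
    tries="${2:-10}"   # number of attempts, roughly one second apart
    while [ "$tries" -gt 0 ]; do
        # Every interface known to the kernel has an entry in this directory.
        if [ -e "$SYSFS_NET/$iface" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    # The interface never appeared within the allowed attempts.
    return 1
}
```

Saved as a script that ends with a call such as wait_for_iface wlp3s0 15, it could run before the ip command, for example from an ExecStartPre= line. Fixing the unit ordering or splitting the service, as suggested above, remains the cleaner approach.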

Resources