Docker daemon service crashes very often on worker node - Linux

The Docker service stops very often on one of my remote worker nodes.
I am not able to figure out why this is happening.
OS: Ubuntu 19.04
Log (journalctl -xe):
Mar 12 10:43:44 machine1 systemd-networkd[434]: vethc827a75: Gained IPv6LL
Mar 12 10:43:44 machine1 kernel: docker_gwbridge: port 2(veth7e595dc) entered disabled state
Mar 12 10:43:45 machine1 kernel: docker_gwbridge: port 3(veth7574e8b) entered blocking state
Mar 12 10:43:45 machine1 kernel: docker_gwbridge: port 3(veth7574e8b) entered forwarding state
Mar 12 10:43:45 machine1 kernel: veth2: renamed from veth3b5a70d
Mar 12 10:43:45 machine1 kernel: br0: port 4(veth2) entered blocking state
Mar 12 10:43:45 machine1 kernel: br0: port 4(veth2) entered disabled state
Mar 12 10:43:45 machine1 kernel: device veth2 entered promiscuous mode
Mar 12 10:43:45 machine1 kernel: br0: port 4(veth2) entered blocking state
Mar 12 10:43:45 machine1 kernel: br0: port 4(veth2) entered forwarding state
Mar 12 10:43:45 machine1 kernel: br0: port 3(veth1) entered disabled state
Mar 12 10:43:45 machine1 kernel: docker_gwbridge: port 3(veth7574e8b) entered disabled state
Mar 12 10:43:45 machine1 kernel: br0: port 4(veth2) entered disabled state
Mar 12 10:43:45 machine1 kernel: docker_gwbridge: port 4(vethcb2c2a4) entered blocking state
Mar 12 10:43:45 machine1 kernel: docker_gwbridge: port 4(vethcb2c2a4) entered disabled state
Mar 12 10:43:45 machine1 kernel: device vethcb2c2a4 entered promiscuous mode
Mar 12 10:43:45 machine1 systemd-udevd[2887]: Could not generate persistent MAC address for vethc361b7b: No such file or directory
Mar 12 10:43:45 machine1 systemd-udevd[2890]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Mar 12 10:43:45 machine1 systemd-udevd[2890]: Could not generate persistent MAC address for veth7574e8b: No such file or directory
Mar 12 10:43:45 machine1 kernel: veth2: renamed from veth6691f49
Mar 12 10:43:45 machine1 kernel: br0: port 4(veth2) entered blocking state
Mar 12 10:43:45 machine1 kernel: br0: port 4(veth2) entered disabled state
Mar 12 10:43:45 machine1 kernel: device veth2 entered promiscuous mode
Mar 12 10:43:45 machine1 kernel: br0: port 4(veth2) entered blocking state
Mar 12 10:43:45 machine1 kernel: br0: port 4(veth2) entered forwarding state
Mar 12 10:43:45 machine1 systemd-udevd[2937]: link_config: could not get ethtool features for vethbf19a70
Mar 12 10:43:45 machine1 systemd-udevd[2937]: Could not set offload features of vethbf19a70: No such device
Mar 12 10:43:45 machine1 systemd-udevd[2889]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Mar 12 10:43:45 machine1 systemd-udevd[2891]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Mar 12 10:43:45 machine1 systemd-udevd[2891]: link_config: could not get ethtool features for veth3b5a70d
Mar 12 10:43:45 machine1 systemd-udevd[2891]: Could not set offload features of veth3b5a70d: No such device
Mar 12 10:43:45 machine1 systemd-udevd[2885]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Mar 12 10:43:45 machine1 systemd-udevd[2885]: Could not generate persistent MAC address for veth2100695: No such file or directory
Mar 12 10:43:45 machine1 systemd-udevd[2884]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.

Related

Could not generate persistent MAC address for veth476ff90: No such file or directory

I'm seeing this error message in my Docker Swarm manager:
Feb 6 10:07:08 swarm-manager systemd-udevd[3557149]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Feb 6 10:07:08 swarm-manager systemd-udevd[3557149]: Could not generate persistent MAC address for veth7bdf732: No such file or directory
Feb 6 10:07:08 swarm-manager NetworkManager[951]: <info> [1675674428.1388] manager: (veth7bdf732): new Veth device (/org/freedesktop/NetworkManager/Devices/36528)
Feb 6 10:07:08 swarm-manager systemd-udevd[3557148]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Feb 6 10:07:08 swarm-manager systemd-udevd[3557148]: Could not generate persistent MAC address for vethe6bef2b: No such file or directory
Feb 6 10:07:08 swarm-manager kernel: veth1: renamed from veth7bdf732
Feb 6 10:07:08 swarm-manager kernel: br0: port 3(veth1) entered blocking state
Feb 6 10:07:08 swarm-manager kernel: br0: port 3(veth1) entered disabled state
Feb 6 10:07:08 swarm-manager kernel: device veth1 entered promiscuous mode
Feb 6 10:07:08 swarm-manager kernel: br0: port 3(veth1) entered blocking state
Feb 6 10:07:08 swarm-manager kernel: br0: port 3(veth1) entered forwarding state
Feb 6 10:07:08 swarm-manager kernel: docker_gwbridge: port 3(veth17599e0) entered blocking state
Feb 6 10:07:08 swarm-manager kernel: docker_gwbridge: port 3(veth17599e0) entered disabled state
Feb 6 10:07:08 swarm-manager kernel: device veth17599e0 entered promiscuous mode
Feb 6 10:07:08 swarm-manager kernel: docker_gwbridge: port 3(veth17599e0) entered blocking state
Feb 6 10:07:08 swarm-manager kernel: docker_gwbridge: port 3(veth17599e0) entered forwarding state
Feb 6 10:07:08 swarm-manager NetworkManager[951]: <info> [1675674428.1755] manager: (veth476ff90): new Veth device (/org/freedesktop/NetworkManager/Devices/36529)
Feb 6 10:07:08 swarm-manager systemd-udevd[3557161]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Feb 6 10:07:08 swarm-manager NetworkManager[951]: <info> [1675674428.1765] manager: (veth17599e0): new Veth device (/org/freedesktop/NetworkManager/Devices/36530)
Feb 6 10:07:08 swarm-manager systemd-udevd[3557161]: Could not generate persistent MAC address for veth476ff90: No such file or directory
I would like to know the real impact of this message. All the services are running properly, but at some point I noticed some instability in my system that somehow triggers an OOM, which may lead to a PostgreSQL database inconsistency.
I also looked at the same issue on Stack Overflow and GitHub, but none of the threads brought me closer to a solution.
My OS is Oracle Linux 8.6.

ElasticSearch docker container remains in Exited status

I recently installed Docker and ElasticSearch 7.17.6. docker-compose up -d worked fine,
but when I try to bring up the ElasticSearch container, its status remains Exited (1) and the container cannot be started.
Command to start: sudo docker container start <container-ID>
The exception below is the output of: sudo docker logs <Container-ID>
Exception in thread "main" java.nio.file.NoSuchFileException: /usr/share/elasticsearch/config/jvm.options
at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:218)
at java.base/java.nio.file.Files.newByteChannel(Files.java:380)
at java.base/java.nio.file.Files.newByteChannel(Files.java:432)
at java.base/java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:422)
at java.base/java.nio.file.Files.newInputStream(Files.java:160)
at org.elasticsearch.tools.launchers.JvmOptionsParser.readJvmOptionsFiles(JvmOptionsParser.java:168)
at org.elasticsearch.tools.launchers.JvmOptionsParser.jvmOptions(JvmOptionsParser.java:124)
at org.elasticsearch.tools.launchers.JvmOptionsParser.main(JvmOptionsParser.java:86)
The /var/log/messages file shows the error below:
msg="error detaching from network es_elastic: could not find network attachment for container <Container-ID> to network es_elastic"
Nov 1 13:11:07 ES11 dockerd: time="2022-11-01T13:11:07.567246932-04:00" level=info msg="initialized VXLAN UDP port to 4789 "
Nov 1 13:11:07 ES11 kernel: br0: port 2(<Device_name2>) entered disabled state
Nov 1 13:11:07 ES11 kernel: br0: port 1(<Device_name1>) entered disabled state
Nov 1 13:11:07 ES11 kernel: ov-1-f: renamed from br0
Nov 1 13:11:07 ES11 kernel: device <Device_name2> left promiscuous mode
Nov 1 13:11:07 ES11 kernel: ov-1-f: port 2(<Device_name2>) entered disabled state
Nov 1 13:11:07 ES11 kernel: device <Device_name1> left promiscuous mode
Nov 1 13:11:07 ES11 kernel: ov-1-f: port 1(<Device_name1>) entered disabled state
Nov 1 13:11:07 ES11 kernel: vx-1-f: renamed from <Device_name1>
Nov 1 13:11:07 ES11 kernel: : renamed from <Device_name2>
Nov 1 13:11:07 ES11 avahi-daemon[891]: Withdrawing workstation service for vx-1-f.
Nov 1 13:11:07 ES11 NetworkManager[999]: <info> [ID.7289] manager: (): new Veth device (/org/freedesktop/NetworkManager/Devices/144)
Nov 1 13:11:07 ES11 kernel: : renamed from eth0
Nov 1 13:11:07 ES11 NetworkManager[999]: <info> [ID.7693] manager: (): new Veth device (/org/freedesktop/NetworkManager/Devices/145)
Nov 1 13:11:07 ES11 dockerd: time="2022-11-01T13:11:07.ID-04:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object
The jvm.options file was missing from the config directory at /u01/es11/config. I am unsure why the error shows a different location, or why it worked on the other nodes, es12 and es13, but the container started fine after I put the file in place.
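A pre-flight check along these lines could catch the missing file before the container is started. This is only a sketch: the helper name missing_config_files is hypothetical, and it assumes (the compose file is not shown) that the host directory /u01/es11/config is bind-mounted onto /usr/share/elasticsearch/config, which would also explain why the exception reports the in-container path rather than the host path.

```python
import tempfile
from pathlib import Path

# Only jvm.options is confirmed by the stack trace above; add other
# file names (for example elasticsearch.yml) if your setup needs them.
REQUIRED_FILES = ("jvm.options",)


def missing_config_files(config_dir, required=REQUIRED_FILES):
    """Return the required file names that are absent from config_dir."""
    base = Path(config_dir)
    return [name for name in required if not (base / name).is_file()]


if __name__ == "__main__":
    # Demonstration against a throwaway directory instead of /u01/es11/config.
    with tempfile.TemporaryDirectory() as tmp:
        print(missing_config_files(tmp))
        (Path(tmp) / "jvm.options").write_text("-Xms1g\n-Xmx1g\n")
        print(missing_config_files(tmp))
```

Running the check on every node (es11, es12, es13) against its own host directory would show which node is missing a file.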

Bluetooth on Raspberry Pi 4 without Linux

I'm working on a non-Linux OS and am trying to enable Bluetooth on a Raspberry Pi 4.
I have some of the necessary drivers: gpio, uart (pl011 and mini-uart), mailbox, and expgpio through that mailbox.
To enable Bluetooth I take the following steps:
I configure the GPIOs as described in Linux's dts so that UART0 is connected to the BT/WiFi chip;
I set the BT_ON expgpio to 1 through the mailbox (this is the default, I just make sure);
I write a command to UART0, and nothing happens. The UART driver reports success, but reading the command response always times out.
I think I may have missed a step in the initialization procedure, but as far as I can see in the Linux log there is only the firmware download, and many commands, such as reading the device name, can be executed before it.
Maybe I forgot to enable a clock source or a regulator, but I have no idea where to start looking.
Here is part of the Raspbian kernel log with additional debug info:
Jan 28 05:17:13 raspberrypi kernel: [ 15.321055] Bluetooth: Core ver 2.22
Jan 28 05:17:13 raspberrypi kernel: [ 15.321093] device class 'bluetooth': registering
Jan 28 05:17:13 raspberrypi kernel: [ 15.321149] NET: Registered PF_BLUETOOTH protocol family
Jan 28 05:17:13 raspberrypi kernel: [ 15.321158] Bluetooth: HCI device and connection manager initialized
Jan 28 05:17:13 raspberrypi kernel: [ 15.321176] Bluetooth: HCI socket layer initialized
Jan 28 05:17:13 raspberrypi kernel: [ 15.321189] Bluetooth: L2CAP socket layer initialized
Jan 28 05:17:13 raspberrypi kernel: [ 15.321208] Bluetooth: SCO socket layer initialized
Jan 28 05:17:13 raspberrypi kernel: [ 15.335356] Bluetooth: HCI UART driver ver 2.3
Jan 28 05:17:13 raspberrypi kernel: [ 15.335377] Bluetooth: HCI UART protocol H4 registered at id 0
Jan 28 05:17:13 raspberrypi kernel: [ 15.335387] bus: 'serial': add driver hci_uart_h5
Jan 28 05:17:13 raspberrypi kernel: [ 15.335456] Bluetooth: HCI UART protocol Three-wire (H5) registered at id 2
Jan 28 05:17:13 raspberrypi kernel: [ 15.335480] bus: 'platform': add driver hci_bcm
Jan 28 05:17:13 raspberrypi kernel: [ 15.335641] bus: 'serial': add driver hci_uart_bcm
Jan 28 05:17:13 raspberrypi kernel: [ 15.335679] Bluetooth: HCI UART protocol Broadcom registered at id 7
Jan 28 05:17:13 raspberrypi kernel: [ 15.337922] Bluetooth: TTY name ttyAMA0
Jan 28 05:17:13 raspberrypi kernel: [ 15.338543] Bluetooth: hci_uart_register_dev
Jan 28 05:17:13 raspberrypi kernel: [ 15.338599] device: 'hci0': device_add
Jan 28 05:17:13 raspberrypi kernel: [ 15.345358] device: 'rfkill1': device_add
Jan 28 05:17:13 raspberrypi kernel: [ 15.345497] Bluetooth: HCI UART protocol set. Proto H4; id 0
Jan 28 05:17:13 raspberrypi kernel: [ 15.345530] Bluetooth: hci_uart_open hci0 5d898f04
Jan 28 05:17:13 raspberrypi kernel: [ 15.345543] Bluetooth: hci_uart_setup: START
Jan 28 05:17:13 raspberrypi kernel: [ 15.345550] Bluetooth: hci_uart_setup: init speed = 0
Jan 28 05:17:13 raspberrypi kernel: [ 15.345557] Bluetooth: hci_uart_setup: oper speed = 0
Jan 28 05:17:13 raspberrypi kernel: [ 15.352975] Bluetooth: hci0: type 1 len 3
Jan 28 05:17:13 raspberrypi kernel: [ 15.353010] Bluetooth skb: 00000000: 01 03 10 00
Jan 28 05:17:13 raspberrypi kernel: [ 15.353026] Bluetooth: hci_uart_write_work written 4
Jan 28 05:17:13 raspberrypi kernel: [ 15.353760] Bluetooth: hci0: type 1 len 3
Jan 28 05:17:13 raspberrypi kernel: [ 15.353826] Bluetooth skb: 00000000: 01 01 10 00
....
a lot of lines
....
Jan 28 05:17:13 raspberrypi btuart[479]: bcm43xx_init
Jan 28 05:17:13 raspberrypi btuart[479]: Flash firmware /lib/firmware/brcm/BCM4345C0.hcd
Jan 28 05:17:13 raspberrypi btuart[479]: Set Controller UART speed to 3000000 bit/s
Jan 28 05:17:13 raspberrypi btuart[479]: Device setup complete
Jan 28 05:17:13 raspberrypi systemd[1]: Starting Load/Save RF Kill Switch Status...
Jan 28 05:17:13 raspberrypi systemd[1]: Started Configure Bluetooth Modems connected by UART.
Jan 28 05:17:13 raspberrypi systemd[1]: Reached target Multi-User System.
Jan 28 05:17:13 raspberrypi systemd[1]: Reached target Graphical Interface.
Jan 28 05:17:13 raspberrypi systemd[1]: Starting Update UTMP about System Runlevel Changes...
Jan 28 05:17:13 raspberrypi systemd[625]: Reached target Bluetooth.
Jan 28 05:17:13 raspberrypi systemd[1]: Started Load/Save RF Kill Switch Status.
Jan 28 05:17:13 raspberrypi systemd[1]: Created slice system-bthelper.slice.
Jan 28 05:17:13 raspberrypi systemd[1]: Starting Raspberry Pi bluetooth helper...
Jan 28 05:17:13 raspberrypi systemd[1]: systemd-update-utmp-runlevel.service: Succeeded.
Jan 28 05:17:13 raspberrypi systemd[1]: Finished Update UTMP about System Runlevel Changes.
Jan 28 05:17:13 raspberrypi bthelper[774]: Raspberry Pi BDADDR already set
Jan 28 05:17:13 raspberrypi systemd[1]: Finished Raspberry Pi bluetooth helper.
Jan 28 05:17:13 raspberrypi kernel: [ 15.490868] Bluetooth: hci0: type 1 len 8
Jan 28 05:17:13 raspberrypi kernel: [ 15.490909] Bluetooth skb: 00000000: 01 1c fc 05 01 02 00 01 01
Jan 28 05:17:13 raspberrypi kernel: [ 15.490930] Bluetooth: hci_uart_write_work written 9
Thank you in advance
The H4 protocol requires a UART with hardware flow control (RTS/CTS). Adding hardware flow control support to the PL011 UART driver resolved the problem.
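For anyone comparing their own UART traffic with the "Bluetooth skb" dumps in the log above, here is a minimal sketch of decoding an H4-framed HCI command packet. The function name decode_h4_command is hypothetical. The layout it assumes is the standard one: a packet-type byte (0x01 for commands), a 16-bit little-endian opcode made of a 6-bit OGF and a 10-bit OCF, a one-byte parameter length, and then the parameters.

```python
H4_COMMAND = 0x01  # H4 packet-type indicator for an HCI command


def decode_h4_command(raw: bytes) -> dict:
    """Split one H4-framed HCI command packet into its fields."""
    if len(raw) < 4:
        raise ValueError("an H4 command packet needs at least 4 bytes")
    if raw[0] != H4_COMMAND:
        raise ValueError("not an HCI command packet (type byte is not 0x01)")
    opcode = int.from_bytes(raw[1:3], "little")
    param_len = raw[3]
    params = raw[4:4 + param_len]
    if len(params) != param_len or len(raw) != 4 + param_len:
        raise ValueError("parameter length does not match the packet size")
    return {
        "opcode": opcode,
        "ogf": opcode >> 10,      # opcode group field (upper 6 bits)
        "ocf": opcode & 0x03FF,   # opcode command field (lower 10 bits)
        "params": params,
    }


if __name__ == "__main__":
    # Byte strings copied from the kernel log above.
    for dump in ("01 03 10 00", "01 01 10 00", "01 1c fc 05 01 02 00 01 01"):
        print(dump, "->", decode_h4_command(bytes.fromhex(dump)))
```

The last packet decodes to OGF 0x3f, the vendor-specific group, which fits the Broadcom setup traffic seen at that point in the log.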

Pop OS / Dell XPS 9310 -- battery drained overnight on suspend

My laptop is suspending on lid close successfully, but if I don't have it plugged in overnight, the battery is drained by the morning.
I'm including logs from a short suspend I ran just now. I can suspend it overnight and look at the logs afterward, but is there anything immediately suspicious here? I validated that all suspend-related targets are loaded via sudo systemctl status sleep.target suspend.target hibernate.target hybrid-sleep.target
Apr 11 22:09:29 pop-os systemd[1]: Reached target Sleep.
Apr 11 22:09:29 pop-os systemd[1]: Starting Suspend...
Apr 11 22:09:29 pop-os kernel: [ 44.986190] PM: suspend entry (s2idle)
Apr 11 22:09:29 pop-os systemd-sleep[3730]: Suspending system...
Apr 11 22:09:29 pop-os kernel: [ 44.991600] Filesystems sync: 0.005 seconds
Apr 11 22:09:57 pop-os kernel: [ 44.994638] Freezing user space processes ... (elapsed 0.002 seconds) done.
Apr 11 22:09:57 pop-os kernel: [ 44.996920] OOM killer disabled.
Apr 11 22:09:57 pop-os kernel: [ 44.996921] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
Apr 11 22:09:57 pop-os kernel: [ 44.998055] printk: Suspending console(s) (use no_console_suspend to debug)
Apr 11 22:09:57 pop-os kernel: [ 45.315954] psmouse serio1: Failed to disable mouse on isa0060/serio1
Apr 11 22:09:57 pop-os kernel: [ 46.377203] ACPI: EC: interrupt blocked
Apr 11 22:09:57 pop-os kernel: [ 72.605807] ACPI: EC: interrupt unblocked
Apr 11 22:09:57 pop-os kernel: [ 73.107660] pcieport 10000:e0:06.0: can't derive routing for PCI INT A
Apr 11 22:09:57 pop-os kernel: [ 73.107666] nvme 10000:e1:00.0: PCI INT A: no GSI
Apr 11 22:09:57 pop-os kernel: [ 73.114494] nvme nvme0: 8/0/0 default/read/poll queues
Apr 11 22:09:57 pop-os kernel: [ 73.363725] OOM killer enabled.
Apr 11 22:09:57 pop-os kernel: [ 73.363728] Restarting tasks ...
Apr 11 22:09:57 pop-os kernel: [ 73.364154] mei_hdcp 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_component_ops [i915])
Apr 11 22:09:57 pop-os kernel: [ 73.367166] done.
Apr 11 22:09:57 pop-os touchegg[1000]: libinput error: event0 - Lid Switch: client bug: event processing lagging behind by 1279ms, your system is too slow
Apr 11 22:09:57 pop-os /usr/libexec/gdm-x-session[1823]: (II) modeset(0): EDID vendor "SHP", prod id 5370
Apr 11 22:09:57 pop-os /usr/libexec/gdm-x-session[1823]: (II) modeset(0): Printing DDC gathered Modelines:
Apr 11 22:09:57 pop-os /usr/libexec/gdm-x-session[1823]: (II) modeset(0): Modeline "3840x2400"x0.0 592.50 3840 3888 3920 4000 2400 2403 2409 2469 -hsync -vsync (148.1 kHz eP)
Apr 11 22:09:57 pop-os /usr/libexec/gdm-x-session[1823]: (II) modeset(0): Modeline "3840x2400"x0.0 474.00 3840 3888 3920 4000 2400 2403 2409 2469 -hsync -vsync (118.5 kHz e)
Apr 11 22:09:57 pop-os systemd-sleep[3730]: System resumed.
Apr 11 22:09:57 pop-os bluetoothd[961]: Controller resume with wake event 0x0
Apr 11 22:09:57 pop-os kernel: [ 73.413202] PM: suspend exit
Apr 11 22:09:57 pop-os systemd[1]: systemd-suspend.service: Succeeded.
Apr 11 22:09:57 pop-os systemd[1]: Finished Suspend.
Apr 11 22:09:57 pop-os systemd[1]: Stopped target Sleep.
Apr 11 22:09:57 pop-os systemd[1]: Reached target Suspend.
Apr 11 22:09:57 pop-os systemd[1]: Stopped target Suspend.
Apr 11 22:09:57 pop-os NetworkManager[968]: <info> [1649729397.3461] manager: sleep: wake requested (sleeping: yes enabled: yes)
Apr 11 22:09:57 pop-os NetworkManager[968]: <info> [1649729397.3461] device (wlp113s0): state change: unmanaged -> unavailable (reason 'managed', sys-iface-state: 'external')
Apr 11 22:09:57 pop-os ModemManager[1079]: <info> [sleep-monitor] system is resuming
Apr 11 22:09:57 pop-os NetworkManager[968]: <info> [1649729397.4258] manager: NetworkManager state is now DISCONNECTED
The hardware on this system only supports s2idle sleep, not the deep sleep state that consumes less energy (details on the different sleep states: https://www.kernel.org/doc/Documentation/power/states.txt).
pop-os:~$ sudo cat /sys/power/mem_sleep
[s2idle]
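The same check can be scripted. The sketch below parses the contents of /sys/power/mem_sleep, where the kernel lists the available suspend variants separated by spaces and puts the active one in square brackets. The function name parse_mem_sleep is hypothetical.

```python
from pathlib import Path


def parse_mem_sleep(text):
    """Return (active, available) from the contents of /sys/power/mem_sleep."""
    active = None
    available = []
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token  # the bracketed entry is the one currently selected
        available.append(token)
    return active, available


if __name__ == "__main__":
    path = Path("/sys/power/mem_sleep")
    if path.exists():
        active, available = parse_mem_sleep(path.read_text())
        print("active:", active, "available:", available)
        if "deep" not in available:
            print("deep sleep is not offered with the current firmware settings")
```

On the machine above this reports s2idle as both the active and the only available variant.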
I found this thread: https://www.dell.com/community/XPS/XPS-13-9310-Ubuntu-deep-sleep-missing/td-p/7734008 It suggests changing the disk management from RAID (Dell's default) to AHCI via the Dell BIOS.
So far this has worked as a solution! I've lost only 10% battery overnight, and can go 3 days idling in suspend without a charge.
(Before this, I did try enabling hibernate through these instructions from System76 https://support.system76.com/articles/enable-hibernation/. This does not work great, because the Killer wifi driver does not load on wake from hibernate.)
With hybrid suspend, the machine's state is written to swap space and then suspend to RAM (aka sleep) is invoked. This keeps power use minimal.
The reason for doing so: waking from hibernate is slower than waking from sleep. To ensure the system state is not lost, the machine's state is stored in swap space, and then sleep is invoked, which uses minimal power and does not shut the machine off. The state is also kept in RAM, so if the battery does not die, wake-up happens from RAM, which is faster.
Read more: https://wiki.archlinux.org/title/Power_management/Suspend_and_hibernate
If you want to keep your battery from draining, switch your lid-close action from sleep/suspend to hibernate. Hibernate has zero power consumption. Follow the steps below.
$ grep HandleLidSwitch /etc/systemd/logind.conf
HandleLidSwitch=suspend
If the line is commented out, uncomment it by removing the "#" and change the option to hibernate.
HandleLidSwitch=hibernate
If you are new to Linux, you can use gedit to edit the file.
sudo gedit /etc/systemd/logind.conf
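The edit described above can also be scripted. This is only a sketch: the helper name set_lid_switch is hypothetical, it works on the file's text rather than writing /etc/systemd/logind.conf itself, and it assumes the file contains only the [Login] section, so a missing key can simply be appended at the end.

```python
import re

# Matches "HandleLidSwitch=..." with or without a leading "#", but not
# related keys such as HandleLidSwitchDocked or HandleLidSwitchExternalPower.
_LID_LINE = re.compile(r"^[ \t]*#?[ \t]*HandleLidSwitch[ \t]*=.*$", re.MULTILINE)


def set_lid_switch(conf_text: str, action: str = "hibernate") -> str:
    """Return conf_text with HandleLidSwitch uncommented and set to action."""
    new_line = f"HandleLidSwitch={action}"
    updated, count = _LID_LINE.subn(new_line, conf_text)
    if count == 0:
        # Key not present at all: append it (assumes [Login] is the only section).
        if updated and not updated.endswith("\n"):
            updated += "\n"
        updated += new_line + "\n"
    return updated


if __name__ == "__main__":
    sample = "[Login]\n#HandleLidSwitch=suspend\n#HandleLidSwitchDocked=ignore\n"
    print(set_lid_switch(sample))
```

Writing the result back requires root, and the new setting is picked up after a reboot.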

docker exec: rpc error: code = 2 desc = oci runtime error: exec failed

Every time I try to run:
$ docker exec
I get the error message:
rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "process_linux.go:83: executing setns process caused \"exit status 16\""
Session 1 (works like expected):
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
alpine latest baa5d63471ea 7 weeks ago 4.8 MB
hello-world latest c54a2cc56cbb 5 months ago 1.85 kB
$ docker run --rm --name alpine -it alpine sh
/ # pwd
/
Session 2:
$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7bd39b37aee2 alpine "sh" 22 seconds ago Up 21 seconds alpine
$ docker exec -it alpine sh
rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "process_linux.go:83: executing setns process caused \"exit status 16\""
$ docker exec -it 7bd39b37aee2 sh
rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "process_linux.go:83: executing setns process caused \"exit status 16\""
/var/log/syslog shows some warnings, but I was able neither to understand the root cause nor to find matching answers.
Thanks for any hint.
= = = = = = = = = = = = = = = = = = = = = = = = =
$ docker info
Containers: 1
Running: 1
Paused: 0
Stopped: 0
Images: 2
Server Version: 1.13.0-rc3
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 4
Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 03e5862ec0d8d3b3f750e19fca3ee367e13c090e
runc version: 51371867a01c467f08af739783b8beafc154c4d7
init version: 949e6fa
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 4.4.0-53-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.487 GiB
Name: pb7tt6ts
ID: YQ4G:ETTP:5VCM:PAJD:F3KB:O7JN:AZOF:VLTI:SKH4:BTSR:KP7D:NXIZ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
= = =
/var/log/syslog during the docker restart and the steps above
= = =
Dec 13 14:28:09 pb7tt6ts systemd[1]: Stopping Docker Socket for the API.
Dec 13 14:28:09 pb7tt6ts systemd[1]: Starting Docker Socket for the API.
Dec 13 14:28:09 pb7tt6ts systemd[1]: Listening on Docker Socket for the API.
Dec 13 14:28:09 pb7tt6ts systemd[1]: Starting Docker Application Container Engine...
Dec 13 14:28:09 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:09.291301057+01:00" level=info msg="libcontainerd: new containerd process, pid: 1448"
Dec 13 14:28:10 pb7tt6ts kernel: [25908.125394] audit: type=1400 audit(1481635690.357:28): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="docker-default" pid=1466 comm="apparmor_parser"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.370364923+01:00" level=info msg="[graphdriver] using prior storage driver: aufs"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.387915069+01:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.388367650+01:00" level=warning msg="Your kernel does not support swap memory limit."
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.388465142+01:00" level=warning msg="Your kernel does not support cgroup rt period"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.388508739+01:00" level=warning msg="Your kernel does not support cgroup rt runtime"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.389419384+01:00" level=info msg="Loading containers: start."
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.397339748+01:00" level=info msg="Firewalld running: false"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.628011070+01:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.743703578+01:00" level=info msg="Loading containers: done."
Dec 13 14:28:10 pb7tt6ts kernel: [25908.510718] aufs au_opts_verify:1597:dockerd[1462]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.808510166+01:00" level=info msg="Daemon has completed initialization"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.808575966+01:00" level=info msg="Docker daemon" commit=4d92237 graphdriver=aufs version=1.13.0-rc3
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.820562161+01:00" level=info msg="API listen on /var/run/docker.sock"
Dec 13 14:28:10 pb7tt6ts systemd[1]: Started Docker Application Container Engine.
Dec 13 14:28:10 pb7tt6ts console-kit-daemon[3106]: console-kit-daemon[3106]: GLib-CRITICAL: Source ID 226 was not found when attempting to remove it
Dec 13 14:28:10 pb7tt6ts console-kit-daemon[3106]: GLib-CRITICAL: Source ID 226 was not found when attempting to remove it
Dec 13 14:28:16 pb7tt6ts kernel: [25914.206672] aufs au_opts_verify:1597:dockerd[1460]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 14:28:16 pb7tt6ts kernel: [25914.388393] aufs au_opts_verify:1597:dockerd[1460]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 14:28:16 pb7tt6ts kernel: [25914.492197] aufs au_opts_verify:1597:dockerd[1460]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <warn> [1481635696.7320] device (vethff6f844): failed to find device 35 'vethff6f844' with udev
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <info> [1481635696.7340] manager: (vethff6f844): new Veth device (/org/freedesktop/NetworkManager/Devices/46)
Dec 13 14:28:16 pb7tt6ts systemd-udevd[1614]: Could not generate persistent MAC address for vethff6f844: No such file or directory
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <warn> [1481635696.7345] device (veth13c2a1d): failed to find device 36 'veth13c2a1d' with udev
Dec 13 14:28:16 pb7tt6ts systemd-udevd[1615]: Could not generate persistent MAC address for veth13c2a1d: No such file or directory
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <info> [1481635696.7417] manager: (veth13c2a1d): new Veth device (/org/freedesktop/NetworkManager/Devices/47)
Dec 13 14:28:16 pb7tt6ts kernel: [25914.509027] device veth13c2a1d entered promiscuous mode
Dec 13 14:28:16 pb7tt6ts kernel: [25914.509240] IPv6: ADDRCONF(NETDEV_UP): veth13c2a1d: link is not ready
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <info> [1481635696.7632] devices added (path: /sys/devices/virtual/net/vethff6f844, iface: vethff6f844)
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <info> [1481635696.7632] device added (path: /sys/devices/virtual/net/vethff6f844, iface: vethff6f844): no ifupdown configuration found.
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <info> [1481635696.7639] devices added (path: /sys/devices/virtual/net/veth13c2a1d, iface: veth13c2a1d)
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <info> [1481635696.7640] device added (path: /sys/devices/virtual/net/veth13c2a1d, iface: veth13c2a1d): no ifupdown configuration found.
Dec 13 14:28:16 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:16.965015836+01:00" level=warning msg="Your kernel does not support swap memory limit."
Dec 13 14:28:16 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:16.965090775+01:00" level=warning msg="Your kernel does not support cgroup rt period"
Dec 13 14:28:16 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:16.965117179+01:00" level=warning msg="Your kernel does not support cgroup rt runtime"
Dec 13 14:28:17 pb7tt6ts kernel: [25914.808163] eth0: renamed from vethff6f844
Dec 13 14:28:17 pb7tt6ts acvpnagent[2339]: Function: tableCallbackHandler File: RouteMgr.cpp Line: 1723 Invoked Function: recv Return Code: 11 (0x0000000B) Description: unknown
Dec 13 14:28:17 pb7tt6ts NetworkManager[1343]: <info> [1481635697.0599] devices removed (path: /sys/devices/virtual/net/vethff6f844, iface: vethff6f844)
Dec 13 14:28:17 pb7tt6ts acvpnagent[2339]: A new network interface has been detected.
Dec 13 14:28:17 pb7tt6ts NetworkManager[1343]: <info> [1481635697.0600] device (vethff6f844): driver 'veth' does not support carrier detection.
Dec 13 14:28:17 pb7tt6ts acvpnagent[2339]: Function: logInterfaces File: RouteMgr.cpp Line: 2105 Invoked Function: logInterfaces Return Code: 0 (0x00000000) Description: IP Address Interface List: 192.168.178.24 172.17.0.1 9.145.68.34 FE80:0:0:0:D8B4:C1E0:F8E4:DB77 FE80:0:0:0:42:44FF:FEC9:5D85 FE80:0:0:0:60A9:A1FF:FEED:F31C
Dec 13 14:28:17 pb7tt6ts NetworkManager[1343]: <info> [1481635697.0604] device (veth13c2a1d): link connected
Dec 13 14:28:17 pb7tt6ts NetworkManager[1343]: <info> [1481635697.0605] device (docker0): link connected
Dec 13 14:28:17 pb7tt6ts kernel: [25914.823988] IPv6: ADDRCONF(NETDEV_CHANGE): veth13c2a1d: link becomes ready
Dec 13 14:28:17 pb7tt6ts kernel: [25914.824039] docker0: port 1(veth13c2a1d) entered forwarding state
Dec 13 14:28:17 pb7tt6ts kernel: [25914.824061] docker0: port 1(veth13c2a1d) entered forwarding state
Dec 13 14:28:18 pb7tt6ts acvpnagent[2339]: Function: tableCallbackHandler File: RouteMgr.cpp Line: 1723 Invoked Function: recv Return Code: 11 (0x0000000B) Description: unknown
Dec 13 14:28:18 pb7tt6ts avahi-daemon[1217]: Joining mDNS multicast group on interface veth13c2a1d.IPv6 with address fe80::60a9:a1ff:feed:f31c.
Dec 13 14:28:18 pb7tt6ts avahi-daemon[1217]: New relevant interface veth13c2a1d.IPv6 for mDNS.
Dec 13 14:28:18 pb7tt6ts avahi-daemon[1217]: Registering new address record for fe80::60a9:a1ff:feed:f31c on veth13c2a1d.*.
Dec 13 14:28:32 pb7tt6ts kernel: [25929.850840] docker0: port 1(veth13c2a1d) entered forwarding state
Dec 13 14:28:36 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:36.704565159+01:00" level=error msg="Error running exec in container: rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused \"process_linux.go:83: executing setns process caused \\\"exit status 16\\\"\"\n"
Dec 13 14:28:36 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:36.705362948+01:00" level=error msg="Handler for POST /v1.25/exec/8a78f29ef71d4c3ab982a8dd7a4a325e280766072dea7337860874a72c42f42c/resize returned error: rpc error: code = 2 desc = containerd: process not found for container"
Dec 13 14:28:46 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:46.921880770+01:00" level=error msg="Error running exec in container: rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused \"process_linux.go:83: executing setns process caused \\\"exit status 16\\\"\"\n"
Dec 13 14:28:46 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:46.922576933+01:00" level=error msg="Handler for POST /v1.25/exec/5ad25668cac553118b8c702f02c69b427436eb67d1488d4170641bcacfdad50b/resize returned error: rpc error: code = 2 desc = containerd: process not found for container"
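Since most of the syslog output above is network and udev noise, a small filter can pull out just the dockerd entries of a given level. This is a sketch that assumes the level=... msg="..." layout seen in the lines above; the function name dockerd_messages is hypothetical.

```python
import re

# level=<word> msg="<text>", where the text may contain backslash-escaped quotes.
_ENTRY = re.compile(r'level=(\w+) msg="((?:[^"\\]|\\.)*)"')


def dockerd_messages(lines, level="error"):
    """Return the msg text of every dockerd log line with the given level."""
    found = []
    for line in lines:
        match = _ENTRY.search(line)
        if match and match.group(1) == level:
            found.append(match.group(2))
    return found


if __name__ == "__main__":
    # Reads syslog text from standard input, for example:
    #   python3 this_script.py < /var/log/syslog
    import sys
    for message in dockerd_messages(sys.stdin):
        print(message)
```

Applied to the excerpt above, this leaves only the four exec-related errors (two "exec failed" and two "process not found for container").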
As recommended, I reverted to a stable release of Docker and installed docker-engine 1.12.4:
$ docker info
Containers: 2
Running: 1
Paused: 0
Stopped: 1
Images: 3
Server Version: 1.12.4
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 11
Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: host bridge null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: apparmor seccomp
Kernel Version: 4.4.0-53-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.487 GiB
Name: pb7tt6ts
ID: YQ4G:ETTP:5VCM:PAJD:F3KB:O7JN:AZOF:VLTI:SKH4:BTSR:KP7D:NXIZ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
127.0.0.0/8
Still no success, but the error is different:
$ docker exec -it alpine sh
rpc error: code = 13 desc = invalid header field value "oci runtime error: exec failed: container_linux.go:247: starting container process caused \"process_linux.go:83: executing setns process caused \\\"exit status 17\\\"\"\n"
Corresponding /var/log/syslog, covering service docker start (21:00), docker run ... (21:01), and docker exec ... (21:02):
Dec 13 21:00:01 pb7tt6ts systemd[1]: Starting Docker Socket for the API.
Dec 13 21:00:01 pb7tt6ts systemd[1]: Listening on Docker Socket for the API.
Dec 13 21:00:01 pb7tt6ts systemd[1]: Starting Docker Application Container Engine...
Dec 13 21:00:01 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:01.468921183+01:00" level=info msg="libcontainerd: new containerd process, pid: 8686"
Dec 13 21:00:02 pb7tt6ts kernel: [49419.124965] audit: type=1400 audit(1481659202.536:37): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="docker-default" pid=8700 comm="apparmor_parser"
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.550070413+01:00" level=info msg="[graphdriver] using prior storage driver \"aufs\""
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.572067603+01:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.572336166+01:00" level=warning msg="Your kernel does not support swap memory limit."
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.572799562+01:00" level=info msg="Loading containers: start."
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.579465999+01:00" level=info msg="Firewalld running: false"
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.779165187+01:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.903085523+01:00" level=info msg="Loading containers: done."
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.903179108+01:00" level=info msg="Daemon has completed initialization"
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.903208197+01:00" level=info msg="Docker daemon" commit=1564f02 graphdriver=aufs version=1.12.4
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.923282443+01:00" level=info msg="API listen on /var/run/docker.sock"
Dec 13 21:00:02 pb7tt6ts systemd[1]: Started Docker Application Container Engine.
Dec 13 21:01:01 pb7tt6ts kernel: [49477.834789] aufs au_opts_verify:1597:dockerd[8692]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 21:01:01 pb7tt6ts kernel: [49477.896566] aufs au_opts_verify:1597:dockerd[8692]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 21:01:01 pb7tt6ts kernel: [49478.080340] aufs au_opts_verify:1597:dockerd[8692]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 21:01:01 pb7tt6ts kernel: [49478.192100] aufs au_opts_verify:1597:dockerd[8682]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <warn> [1481659261.6125] device (veth2b5b07c): failed to find device 47 'veth2b5b07c' with udev
Dec 13 21:01:01 pb7tt6ts systemd-udevd[8810]: Could not generate persistent MAC address for vethc2e4873: No such file or directory
Dec 13 21:01:01 pb7tt6ts kernel: [49478.196917] device vethc2e4873 entered promiscuous mode
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.6215] manager: (veth2b5b07c): new Veth device (/org/freedesktop/NetworkManager/Devices/63)
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <warn> [1481659261.6222] device (vethc2e4873): failed to find device 48 'vethc2e4873' with udev
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.6241] manager: (vethc2e4873): new Veth device (/org/freedesktop/NetworkManager/Devices/64)
Dec 13 21:01:01 pb7tt6ts systemd-udevd[8809]: Could not generate persistent MAC address for veth2b5b07c: No such file or directory
Dec 13 21:01:01 pb7tt6ts kernel: [49478.211913] IPv6: ADDRCONF(NETDEV_UP): vethc2e4873: link is not ready
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.6454] devices added (path: /sys/devices/virtual/net/veth2b5b07c, iface: veth2b5b07c)
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.6454] device added (path: /sys/devices/virtual/net/veth2b5b07c, iface: veth2b5b07c): no ifupdown configuration found.
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.6507] devices added (path: /sys/devices/virtual/net/vethc2e4873, iface: vethc2e4873)
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.6507] device added (path: /sys/devices/virtual/net/vethc2e4873, iface: vethc2e4873): no ifupdown configuration found.
Dec 13 21:01:01 pb7tt6ts kernel: [49478.557310] eth0: renamed from veth2b5b07c
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.9915] devices removed (path: /sys/devices/virtual/net/veth2b5b07c, iface: veth2b5b07c)
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.9916] device (veth2b5b07c): driver 'veth' does not support carrier detection.
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.9919] device (vethc2e4873): link connected
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.9937] device (docker0): link connected
Dec 13 21:01:01 pb7tt6ts kernel: [49478.573434] IPv6: ADDRCONF(NETDEV_CHANGE): vethc2e4873: link becomes ready
Dec 13 21:01:01 pb7tt6ts kernel: [49478.573503] docker0: port 1(vethc2e4873) entered forwarding state
Dec 13 21:01:01 pb7tt6ts kernel: [49478.573527] docker0: port 1(vethc2e4873) entered forwarding state
Dec 13 21:01:03 pb7tt6ts avahi-daemon[1217]: Joining mDNS multicast group on interface vethc2e4873.IPv6 with address fe80::d02a:ecff:fea8:662c.
Dec 13 21:01:03 pb7tt6ts avahi-daemon[1217]: New relevant interface vethc2e4873.IPv6 for mDNS.
Dec 13 21:01:03 pb7tt6ts avahi-daemon[1217]: Registering new address record for fe80::d02a:ecff:fea8:662c on vethc2e4873.*.
Dec 13 21:01:17 pb7tt6ts kernel: [49493.628038] docker0: port 1(vethc2e4873) entered forwarding state
Dec 13 21:02:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:02:02.072027206+01:00" level=error msg="Error running exec in container: rpc error: code = 13 desc = invalid header field value \"oci runtime error: exec failed: container_linux.go:247: starting container process caused \\\"process_linux.go:83: executing setns process caused \\\\\\\"exit status 17\\\\\\\"\\\"\\n\""
Dec 13 21:02:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:02:02.072759152+01:00" level=error msg="Handler for POST /v1.24/exec/00c0dcac7a178129a17cd9eb833d154d428f2a6efbcd0f421ab3c5c54e52a236/resize returned error: rpc error: code = 2 desc = containerd: process not found for container"
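The daemon's own error lines are easy to lose among the NetworkManager, avahi and kernel messages above. A minimal sketch of a filter that keeps only them, assuming the log is at /var/log/syslog as in this question (the function name docker_errors is just an illustrative choice):

```shell
# docker_errors: read syslog text on stdin and print only the lines
# written by dockerd at level=error (hypothetical helper name).
docker_errors() {
    grep 'dockerd\[' | grep 'level=error'
}

# Run against the live log only if it is readable on this machine.
if [ -r /var/log/syslog ]; then
    docker_errors < /var/log/syslog || true
fi
```

On the excerpt above this would leave just the "Error running exec in container" and "Handler for POST ... resize returned error" lines.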
This comment from the linked issue appears to identify the root cause:
I think I found the root reason. It's nothing to do with Docker.
Actually, docker exec always fails because of Symantec AutoProtect
running on my system. It loads a custom kernel module that adds some
file operation hooks, which affects the result of setns.
$ lsmod | grep symev
symev_custom_dkms_x86_64 72166 2 symap_custom_dkms_x86_64
The workaround is to disable Symantec AutoProtect and reboot.
sudo update-rc.d autoprotect disable
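To confirm after the reboot that the module is really gone, the lsmod check from the comment can be wrapped in a small helper. This is only a sketch: the function name loaded_modules_matching is made up for illustration, and the symev prefix is taken from the lsmod output quoted above, so it may differ on other Symantec versions.

```shell
# loaded_modules_matching PREFIX
# Reads /proc/modules-style lines on stdin and prints the names of
# modules whose name starts with PREFIX (hypothetical helper name).
loaded_modules_matching() {
    awk -v prefix="$1" 'index($1, prefix) == 1 { print $1 }'
}

# Check the running kernel, if /proc/modules is readable.
if [ -r /proc/modules ]; then
    loaded_modules_matching symev < /proc/modules
fi
```

If this prints nothing, no module with that prefix is loaded, and docker exec is worth retrying.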

Resources