Weave takes my node offline when I hit a container's ip - azure

I'm running weave with kubernetes/cni . I have a wordpress/mysql pod running on a kube minion. When I hit the url of the wordpress service via the browser, my node goes down (on azure). I upgraded to 8cores and 14gb ram and now when I do hit the wordpress url I find that I can't access the internet i.e. curl google.com which I could do before hitting the wordpress url.
I was curious so I tail -f /var/log/syslog and found the following which may be relevant; Please note I have an nginx pod on the same node and I can access it's url without incident, however the following happens when I hit the wordpress installation page:
Jul 21 16:38:27 sc-minion-1 kernel: [ 6319.723678] device vethwepl4fe8074 entered promiscuous mode
Jul 21 16:38:27 sc-minion-1 kernel: [ 6319.732271] eth0: renamed from vethwepg4fe8074
Jul 21 16:38:27 sc-minion-1 systemd-udevd[8388]: Could not generate persistent MAC address for vethwepl4fe8074: No such file or directory
Jul 21 16:38:27 sc-minion-1 kernel: [ 6319.744321] IPv6: ADDRCONF(NETDEV_UP): vethwepl4fe8074: link is not ready
Jul 21 16:38:27 sc-minion-1 kernel: [ 6319.744661] IPv6: ADDRCONF(NETDEV_CHANGE): vethwepl4fe8074: link becomes ready
Jul 21 16:38:27 sc-minion-1 kernel: [ 6319.744730] weave: port 3(vethwepl4fe8074) entered forwarding state
Jul 21 16:38:27 sc-minion-1 kernel: [ 6319.744736] weave: port 3(vethwepl4fe8074) entered forwarding state
Jul 21 16:38:27 sc-minion-1 docker[883]: time="2016-07-21T16:38:27.987193241Z" level=error msg="Handler for GET /images/nginx/json returned error: No such image: nginx"
Jul 21 16:38:27 sc-minion-1 kubelet[2088]: I0721 16:38:27.987696 2088 provider.go:91] Refreshing cache for provider: *credentialprovider.defaultDockerConfigProvider
Jul 21 16:38:28 sc-minion-1 kernel: [ 6320.564156] IPv6: eth0: IPv6 duplicate address fe80::3c2b:b1ff:fe47:fe26 detected!
Jul 21 16:38:42 sc-minion-1 kernel: [ 6334.748036] weave: port 3(vethwepl4fe8074) entered forwarding state
Jul 21 16:38:42 sc-minion-1 docker[883]: time="2016-07-21T16:38:42.988502303Z" level=warning msg="Error getting v2 registry: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"
Jul 21 16:38:42 sc-minion-1 docker[883]: time="2016-07-21T16:38:42.988545804Z" level=error msg="Attempting next endpoint for pull after error: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"
Jul 21 16:39:02 sc-minion-1 docker[883]: time="2016-07-21T16:39:02.989940573Z" level=error msg="Not continuing with pull after error: Network timed out while trying to connect to https://index.docker.io/v1/repositories/library/nginx/images. You may want to check your internet connection or if you are behind a proxy."

Related

ElasticSearch docker container remains in Exited status

Recently installed Docker, ElasticSearch 7.17.6. docker-compose up -d worked fine
but when trying to bring up the ElasticSearch container, its status remains Exited(1) & can't start the container.
Command to start: sudo docker container start <container-ID>
See below Exception for command: sudo docker logs <Container-ID>
Exception in thread "main" java.nio.file.NoSuchFileException: /usr/share/elasticsearch/config/jvm.options
at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:218)
at java.base/java.nio.file.Files.newByteChannel(Files.java:380)
at java.base/java.nio.file.Files.newByteChannel(Files.java:432)
at java.base/java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:422)
at java.base/java.nio.file.Files.newInputStream(Files.java:160)
at org.elasticsearch.tools.launchers.JvmOptionsParser.readJvmOptionsFiles(JvmOptionsParser.java:168)
at org.elasticsearch.tools.launchers.JvmOptionsParser.jvmOptions(JvmOptionsParser.java:124)
at org.elasticsearch.tools.launchers.JvmOptionsParser.main(JvmOptionsParser.java:86)
var/log/messages file shows below error:
msg="error detaching from network es_elastic: could not find network attachment for container <Container-ID> to network es_elastic"
Nov 1 13:11:07 ES11 dockerd: time="2022-11-01T13:11:07.567246932-04:00" level=info msg="initialized VXLAN UDP port to 4789 "
Nov 1 13:11:07 ES11 kernel: br0: port 2(<Device_name2>) entered disabled state
Nov 1 13:11:07 ES11 kernel: br0: port 1(<Device_name1>) entered disabled state
Nov 1 13:11:07 ES11 kernel: ov-1-f: renamed from br0
Nov 1 13:11:07 ES11 kernel: device <Device_name2> left promiscuous mode
Nov 1 13:11:07 ES11 kernel: ov-1-f: port 2(<Device_name2>) entered disabled state
Nov 1 13:11:07 ES11 kernel: device <Device_name1> left promiscuous mode
Nov 1 13:11:07 ES11 kernel: ov-1-f: port 1(<Device_name1>) entered disabled state
Nov 1 13:11:07 ES11 kernel: vx-1-f: renamed from <Device_name1>
Nov 1 13:11:07 ES11 kernel: : renamed from <Device_name2>
Nov 1 13:11:07 ES11 avahi-daemon[891]: Withdrawing workstation service for vx-1-f.
Nov 1 13:11:07 ES11 NetworkManager[999]: <info> [ID.7289] manager: (): new Veth device (/org/freedesktop/NetworkManager/Devices/144)
Nov 1 13:11:07 ES11 kernel: : renamed from eth0
Nov 1 13:11:07 ES11 NetworkManager[999]: <info> [ID.7693] manager: (): new Veth device (/org/freedesktop/NetworkManager/Devices/145)
Nov 1 13:11:07 ES11 dockerd: time="2022-11-01T13:11:07.ID-04:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object
File jvm.options were missing from the config directory at /u01/es11/config but unsure why the error shows different location and how come it worked in other nodes es12 & es13. But containers started fine after placing these files.

Job for httpd.service failed because the control process exited with error code See "systemctl status httpd.service" and "journalctl -xe" for details

I am unable to restart my apache server to successfully install the SSL certificates.
I get the following error
Job for httpd.service failed because the control process exited with error code. See "systemctl status httpd.service" and "journalctl -xe" for details.
I have tried several articles and the root cause seems to be the following
Mar 29 13:05:09 localhost.localdomain httpd\[1234546\]: (98)Address already in use: AH00072: make_sock: could not bind to address \[::\]:80
Mar 29 13:05:09 localhost.localdomain httpd\[1234546\]: (98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:80
I am able to diagnose the issue and get the following output and is also attached. I am unable to proceed further. Can you please help ?
Server - AlmaLinux 8
Host - IONOS
Server version: Apache/2.4.37 (AlmaLinux)
-- Unit session-62994.scope has finished starting up.
-
-- Unit session-62994.scope has finished starting up.
-
-- The unit session-62994.scope has successfully entered the 'dead' state.
Mar 31 06:07:10 localhost.localdomain dhclient\[1326\]: XMT: Solicit on ens192, interval 110600ms.
Mar 31 06:07:10 localhost.localdomain dhclient\[1326\]: RCV: Advertise message on ens192 from fe80::250:56ff:fe8c:84c6.
Mar 31 06:07:10 localhost.localdomain dhclient\[1326\]: RCV: Advertise message on ens192 from fe80::250:56ff:fe9a:f13a.
Mar 31 06:07:30 localhost.localdomain sshd\[1297516\]: Invalid user sui from 167.99.68.65 port 48488
Mar 31 06:07:30 localhost.localdomain sshd\[1297516\]: pam_unix(sshd:auth): check pass; user unknown
Mar 31 06:07:30 localhost.localdomain sshd\[1297516\]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=167.99.68.65
Mar 31 06:07:32 localhost.localdomain sshd\[1297516\]: Failed password for invalid user sui from 167.99.68.65 port 48488 ssh2
Mar 31 06:07:34 localhost.localdomain sshd\[1297516\]: Received disconnect from 167.99.68.65 port 48488:11: Bye Bye \[preauth\]
Mar 31 06:07:34 localhost.localdomain sshd\[1297516\]: Disconnected from invalid user sui 167.99.68.65 port 48488 \[preauth\]
Mar 31 06:07:44 localhost.localdomain unix_chkpwd\[1297520\]: password check failed for user (root)
Mar 31 06:07:44 localhost.localdomain sshd\[1297518\]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=61.177.173.27 user=root
Mar 31 06:07:46 localhost.localdomain sshd\[1297518\]: Failed password for root from 61.177.173.27 port 58626 ssh2
Mar 31 06:07:46 localhost.localdomain unix_chkpwd\[1297521\]: password check failed for user (root)
\[root#localhost \~\]# ss --listening --tcp --numeric --processes
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
LISTEN 0 128 0.0.0.0:80 0.0.0.0:\* users:(("nginx",pid=1087,fd=10),("nginx",pid=1086,fd=10),("nginx",pid=1084,fd=10))
LISTEN 0 128 0.0.0.0:22 0.0.0.0:\* users:(("sshd",pid=1335,fd=5))
LISTEN 0 128 0.0.0.0:443 0.0.0.0:\* users:(("nginx",pid=1087,fd=11),("nginx",pid=1086,fd=11),("nginx",pid=1084,fd=11))
LISTEN 0 128 \[::\]:22 \[::\]:\* users:(("sshd",pid=1335,fd=7))
LISTEN 0 80 \*:3306 *:* users:(("mysqld",pid=1098,fd=19))
Tried -
apachectl configtest - Result: syntax ok
setenforce 0

docker exec: rpc error: code = 2 desc = oci runtime error: exec failed

every time I try to do:
$ docker exec
I get the error message:
rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "process_linux.go:83: executing setns process caused \"exit status 16\""
Session 1 (works like expected):
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
alpine latest baa5d63471ea 7 weeks ago 4.8 MB
hello-world latest c54a2cc56cbb 5 months ago 1.85 kB
$ docker run --rm --name alpine -it alpine sh
/ # pwd
/
Session 2:
$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7bd39b37aee2 alpine "sh" 22 seconds ago Up 21 seconds alpine
$ docker exec -it alpine sh
rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "process_linux.go:83: executing setns process caused \"exit status 16\""
$ docker exec -it 7bd39b37aee2 sh
rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "process_linux.go:83: executing setns process caused \"exit status 16\""
/var/log/syslog shows some warnings, but I was neither able to understand the root cause not finding matching answers.
Thanks for any hint.
= = = = = = = = = = = = = = = = = = = = = = = = =
$ docker info
Containers: 1
Running: 1
Paused: 0
Stopped: 0
Images: 2
Server Version: 1.13.0-rc3
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 4
Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 03e5862ec0d8d3b3f750e19fca3ee367e13c090e
runc version: 51371867a01c467f08af739783b8beafc154c4d7
init version: 949e6fa
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 4.4.0-53-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.487 GiB
Name: pb7tt6ts
ID: YQ4G:ETTP:5VCM:PAJD:F3KB:O7JN:AZOF:VLTI:SKH4:BTSR:KP7D:NXIZ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
= = =
/var/log/syslog docker restart and steps above
= = =
Dec 13 14:28:09 pb7tt6ts systemd[1]: Stopping Docker Socket for the API.
Dec 13 14:28:09 pb7tt6ts systemd[1]: Starting Docker Socket for the API.
Dec 13 14:28:09 pb7tt6ts systemd[1]: Listening on Docker Socket for the API.
Dec 13 14:28:09 pb7tt6ts systemd[1]: Starting Docker Application Container Engine...
Dec 13 14:28:09 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:09.291301057+01:00" level=info msg="libcontainerd: new containerd process, pid: 1448"
Dec 13 14:28:10 pb7tt6ts kernel: [25908.125394] audit: type=1400 audit(1481635690.357:28): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="docker-default" pid=1466 comm="apparmor_parser"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.370364923+01:00" level=info msg="[graphdriver] using prior storage driver: aufs"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.387915069+01:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.388367650+01:00" level=warning msg="Your kernel does not support swap memory limit."
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.388465142+01:00" level=warning msg="Your kernel does not support cgroup rt period"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.388508739+01:00" level=warning msg="Your kernel does not support cgroup rt runtime"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.389419384+01:00" level=info msg="Loading containers: start."
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.397339748+01:00" level=info msg="Firewalld running: false"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.628011070+01:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.743703578+01:00" level=info msg="Loading containers: done."
Dec 13 14:28:10 pb7tt6ts kernel: [25908.510718] aufs au_opts_verify:1597:dockerd[1462]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.808510166+01:00" level=info msg="Daemon has completed initialization"
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.808575966+01:00" level=info msg="Docker daemon" commit=4d92237 graphdriver=aufs version=1.13.0-rc3
Dec 13 14:28:10 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:10.820562161+01:00" level=info msg="API listen on /var/run/docker.sock"
Dec 13 14:28:10 pb7tt6ts systemd[1]: Started Docker Application Container Engine.
Dec 13 14:28:10 pb7tt6ts console-kit-daemon[3106]: console-kit-daemon[3106]: GLib-CRITICAL: Source ID 226 was not found when attempting to remove it
Dec 13 14:28:10 pb7tt6ts console-kit-daemon[3106]: GLib-CRITICAL: Source ID 226 was not found when attempting to remove it
Dec 13 14:28:16 pb7tt6ts kernel: [25914.206672] aufs au_opts_verify:1597:dockerd[1460]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 14:28:16 pb7tt6ts kernel: [25914.388393] aufs au_opts_verify:1597:dockerd[1460]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 14:28:16 pb7tt6ts kernel: [25914.492197] aufs au_opts_verify:1597:dockerd[1460]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <warn> [1481635696.7320] device (vethff6f844): failed to find device 35 'vethff6f844' with udev
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <info> [1481635696.7340] manager: (vethff6f844): new Veth device (/org/freedesktop/NetworkManager/Devices/46)
Dec 13 14:28:16 pb7tt6ts systemd-udevd[1614]: Could not generate persistent MAC address for vethff6f844: No such file or directory
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <warn> [1481635696.7345] device (veth13c2a1d): failed to find device 36 'veth13c2a1d' with udev
Dec 13 14:28:16 pb7tt6ts systemd-udevd[1615]: Could not generate persistent MAC address for veth13c2a1d: No such file or directory
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <info> [1481635696.7417] manager: (veth13c2a1d): new Veth device (/org/freedesktop/NetworkManager/Devices/47)
Dec 13 14:28:16 pb7tt6ts kernel: [25914.509027] device veth13c2a1d entered promiscuous mode
Dec 13 14:28:16 pb7tt6ts kernel: [25914.509240] IPv6: ADDRCONF(NETDEV_UP): veth13c2a1d: link is not ready
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <info> [1481635696.7632] devices added (path: /sys/devices/virtual/net/vethff6f844, iface: vethff6f844)
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <info> [1481635696.7632] device added (path: /sys/devices/virtual/net/vethff6f844, iface: vethff6f844): no ifupdown configuration found.
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <info> [1481635696.7639] devices added (path: /sys/devices/virtual/net/veth13c2a1d, iface: veth13c2a1d)
Dec 13 14:28:16 pb7tt6ts NetworkManager[1343]: <info> [1481635696.7640] device added (path: /sys/devices/virtual/net/veth13c2a1d, iface: veth13c2a1d): no ifupdown configuration found.
Dec 13 14:28:16 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:16.965015836+01:00" level=warning msg="Your kernel does not support swap memory limit."
Dec 13 14:28:16 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:16.965090775+01:00" level=warning msg="Your kernel does not support cgroup rt period"
Dec 13 14:28:16 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:16.965117179+01:00" level=warning msg="Your kernel does not support cgroup rt runtime"
Dec 13 14:28:17 pb7tt6ts kernel: [25914.808163] eth0: renamed from vethff6f844
Dec 13 14:28:17 pb7tt6ts acvpnagent[2339]: Function: tableCallbackHandler File: RouteMgr.cpp Line: 1723 Invoked Function: recv Return Code: 11 (0x0000000B) Description: unknown
Dec 13 14:28:17 pb7tt6ts NetworkManager[1343]: <info> [1481635697.0599] devices removed (path: /sys/devices/virtual/net/vethff6f844, iface: vethff6f844)
Dec 13 14:28:17 pb7tt6ts acvpnagent[2339]: A new network interface has been detected.
Dec 13 14:28:17 pb7tt6ts NetworkManager[1343]: <info> [1481635697.0600] device (vethff6f844): driver 'veth' does not support carrier detection.
Dec 13 14:28:17 pb7tt6ts acvpnagent[2339]: Function: logInterfaces File: RouteMgr.cpp Line: 2105 Invoked Function: logInterfaces Return Code: 0 (0x00000000) Description: IP Address Interface List: 192.168.178.24 172.17.0.1 9.145.68.34 FE80:0:0:0:D8B4:C1E0:F8E4:DB77 FE80:0:0:0:42:44FF:FEC9:5D85 FE80:0:0:0:60A9:A1FF:FEED:F31C
Dec 13 14:28:17 pb7tt6ts NetworkManager[1343]: <info> [1481635697.0604] device (veth13c2a1d): link connected
Dec 13 14:28:17 pb7tt6ts NetworkManager[1343]: <info> [1481635697.0605] device (docker0): link connected
Dec 13 14:28:17 pb7tt6ts kernel: [25914.823988] IPv6: ADDRCONF(NETDEV_CHANGE): veth13c2a1d: link becomes ready
Dec 13 14:28:17 pb7tt6ts kernel: [25914.824039] docker0: port 1(veth13c2a1d) entered forwarding state
Dec 13 14:28:17 pb7tt6ts kernel: [25914.824061] docker0: port 1(veth13c2a1d) entered forwarding state
Dec 13 14:28:18 pb7tt6ts acvpnagent[2339]: Function: tableCallbackHandler File: RouteMgr.cpp Line: 1723 Invoked Function: recv Return Code: 11 (0x0000000B) Description: unknown
Dec 13 14:28:18 pb7tt6ts avahi-daemon[1217]: Joining mDNS multicast group on interface veth13c2a1d.IPv6 with address fe80::60a9:a1ff:feed:f31c.
Dec 13 14:28:18 pb7tt6ts avahi-daemon[1217]: New relevant interface veth13c2a1d.IPv6 for mDNS.
Dec 13 14:28:18 pb7tt6ts avahi-daemon[1217]: Registering new address record for fe80::60a9:a1ff:feed:f31c on veth13c2a1d.*.
Dec 13 14:28:32 pb7tt6ts kernel: [25929.850840] docker0: port 1(veth13c2a1d) entered forwarding state
Dec 13 14:28:36 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:36.704565159+01:00" level=error msg="Error running exec in container: rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused \"process_linux.go:83: executing setns process caused \\\"exit status 16\\\"\"\n"
Dec 13 14:28:36 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:36.705362948+01:00" level=error msg="Handler for POST /v1.25/exec/8a78f29ef71d4c3ab982a8dd7a4a325e280766072dea7337860874a72c42f42c/resize returned error: rpc error: code = 2 desc = containerd: process not found for container"
Dec 13 14:28:46 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:46.921880770+01:00" level=error msg="Error running exec in container: rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused \"process_linux.go:83: executing setns process caused \\\"exit status 16\\\"\"\n"
Dec 13 14:28:46 pb7tt6ts dockerd[1436]: time="2016-12-13T14:28:46.922576933+01:00" level=error msg="Handler for POST /v1.25/exec/5ad25668cac553118b8c702f02c69b427436eb67d1488d4170641bcacfdad50b/resize returned error: rpc error: code = 2 desc = containerd: process not found for container"
As recommended I reverted to a main version of docker and installed docker-engine 1.12.4
$ docker info
Containers: 2
Running: 1
Paused: 0
Stopped: 1
Images: 3
Server Version: 1.12.4
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 11
Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: host bridge null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: apparmor seccomp
Kernel Version: 4.4.0-53-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.487 GiB
Name: pb7tt6ts
ID: YQ4G:ETTP:5VCM:PAJD:F3KB:O7JN:AZOF:VLTI:SKH4:BTSR:KP7D:NXIZ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
127.0.0.0/8
Furthermore, no success but different error:
$ docker exec -it alpine sh
rpc error: code = 13 desc = invalid header field value "oci runtime error: exec failed: container_linux.go:247: starting container process caused \"process_linux.go:83: executing setns process caused \\\"exit status 17\\\"\"\n"
Corresponding /var/log/syslog from service docker start (21:00), docker run ... (21:01), docker exec ... (21:01)
Dec 13 21:00:01 pb7tt6ts systemd[1]: Starting Docker Socket for the API.
Dec 13 21:00:01 pb7tt6ts systemd[1]: Listening on Docker Socket for the API.
Dec 13 21:00:01 pb7tt6ts systemd[1]: Starting Docker Application Container Engine...
Dec 13 21:00:01 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:01.468921183+01:00" level=info msg="libcontainerd: new containerd process, pid: 8686"
Dec 13 21:00:02 pb7tt6ts kernel: [49419.124965] audit: type=1400 audit(1481659202.536:37): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="docker-default" pid=8700 comm="apparmor_parser"
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.550070413+01:00" level=info msg="[graphdriver] using prior storage driver \"aufs\""
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.572067603+01:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.572336166+01:00" level=warning msg="Your kernel does not support swap memory limit."
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.572799562+01:00" level=info msg="Loading containers: start."
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.579465999+01:00" level=info msg="Firewalld running: false"
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.779165187+01:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.903085523+01:00" level=info msg="Loading containers: done."
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.903179108+01:00" level=info msg="Daemon has completed initialization"
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.903208197+01:00" level=info msg="Docker daemon" commit=1564f02 graphdriver=aufs version=1.12.4
Dec 13 21:00:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:00:02.923282443+01:00" level=info msg="API listen on /var/run/docker.sock"
Dec 13 21:00:02 pb7tt6ts systemd[1]: Started Docker Application Container Engine.
Dec 13 21:01:01 pb7tt6ts kernel: [49477.834789] aufs au_opts_verify:1597:dockerd[8692]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 21:01:01 pb7tt6ts kernel: [49477.896566] aufs au_opts_verify:1597:dockerd[8692]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 21:01:01 pb7tt6ts kernel: [49478.080340] aufs au_opts_verify:1597:dockerd[8692]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 21:01:01 pb7tt6ts kernel: [49478.192100] aufs au_opts_verify:1597:dockerd[8682]: dirperm1 breaks the protection by the permission bits on the lower branch
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <warn> [1481659261.6125] device (veth2b5b07c): failed to find device 47 'veth2b5b07c' with udev
Dec 13 21:01:01 pb7tt6ts systemd-udevd[8810]: Could not generate persistent MAC address for vethc2e4873: No such file or directory
Dec 13 21:01:01 pb7tt6ts kernel: [49478.196917] device vethc2e4873 entered promiscuous mode
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.6215] manager: (veth2b5b07c): new Veth device (/org/freedesktop/NetworkManager/Devices/63)
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <warn> [1481659261.6222] device (vethc2e4873): failed to find device 48 'vethc2e4873' with udev
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.6241] manager: (vethc2e4873): new Veth device (/org/freedesktop/NetworkManager/Devices/64)
Dec 13 21:01:01 pb7tt6ts systemd-udevd[8809]: Could not generate persistent MAC address for veth2b5b07c: No such file or directory
Dec 13 21:01:01 pb7tt6ts kernel: [49478.211913] IPv6: ADDRCONF(NETDEV_UP): vethc2e4873: link is not ready
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.6454] devices added (path: /sys/devices/virtual/net/veth2b5b07c, iface: veth2b5b07c)
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.6454] device added (path: /sys/devices/virtual/net/veth2b5b07c, iface: veth2b5b07c): no ifupdown configuration found.
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.6507] devices added (path: /sys/devices/virtual/net/vethc2e4873, iface: vethc2e4873)
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.6507] device added (path: /sys/devices/virtual/net/vethc2e4873, iface: vethc2e4873): no ifupdown configuration found.
Dec 13 21:01:01 pb7tt6ts kernel: [49478.557310] eth0: renamed from veth2b5b07c
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.9915] devices removed (path: /sys/devices/virtual/net/veth2b5b07c, iface: veth2b5b07c)
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.9916] device (veth2b5b07c): driver 'veth' does not support carrier detection.
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.9919] device (vethc2e4873): link connected
Dec 13 21:01:01 pb7tt6ts NetworkManager[1343]: <info> [1481659261.9937] device (docker0): link connected
Dec 13 21:01:01 pb7tt6ts kernel: [49478.573434] IPv6: ADDRCONF(NETDEV_CHANGE): vethc2e4873: link becomes ready
Dec 13 21:01:01 pb7tt6ts kernel: [49478.573503] docker0: port 1(vethc2e4873) entered forwarding state
Dec 13 21:01:01 pb7tt6ts kernel: [49478.573527] docker0: port 1(vethc2e4873) entered forwarding state
Dec 13 21:01:03 pb7tt6ts avahi-daemon[1217]: Joining mDNS multicast group on interface vethc2e4873.IPv6 with address fe80::d02a:ecff:fea8:662c.
Dec 13 21:01:03 pb7tt6ts avahi-daemon[1217]: New relevant interface vethc2e4873.IPv6 for mDNS.
Dec 13 21:01:03 pb7tt6ts avahi-daemon[1217]: Registering new address record for fe80::d02a:ecff:fea8:662c on vethc2e4873.*.
Dec 13 21:01:17 pb7tt6ts kernel: [49493.628038] docker0: port 1(vethc2e4873) entered forwarding state
Dec 13 21:02:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:02:02.072027206+01:00" level=error msg="Error running exec in container: rpc error: code = 13 desc = invalid header field value \"oci runtime error: exec failed: container_linux.go:247: starting container process caused \\\"process_linux.go:83: executing setns process caused \\\\\\\"exit status 17\\\\\\\"\\\"\\n\""
Dec 13 21:02:02 pb7tt6ts dockerd[8675]: time="2016-12-13T21:02:02.072759152+01:00" level=error msg="Handler for POST /v1.24/exec/00c0dcac7a178129a17cd9eb833d154d428f2a6efbcd0f421ab3c5c54e52a236/resize returned error: rpc error: code = 2 desc = containerd: process not found for container"
From the linked issue is this comment which appears to be the root cause:
I think I found the root reason. It's nothing to do with Docker.
Actually docker exec always fail because of Symantec AutoProtect
running on my system. It loads a custom kernel module that add some
file operation hooks, which affects the result of setns.
$ lsmod | grep symev
symev_custom_dkms_x86_64 72166 2 symap_custom_dkms_x86_64
The workaround is to disable Symantec AutoProtect and reboot.
sudo update-rc.d autoprotect disable

Is there a gcloud API to detect when a Compute Engine server is completely up?

I create an VM instance. I can connect to it as soon ad the SSH Daemon is started. But this is too early because kernel startup is only at approx. 30%. Is there a gcloud or other API to get the VM state when the kernel has finished startup?
Nov 18 10:58:51 image-name google: No startup script found in metadata.
Nov 18 10:58:53 image-name kernel: [ 27.491829] aufs au_opts_verify:1570:docker[2414]: dirperm1 breaks the protection by the permission bits on the lower branch
Nov 18 10:58:53 image-name kernel: [ 27.703142] aufs au_opts_verify:1570:docker[2414]: dirperm1 breaks the protection by the permission bits on the lower branch
Nov 18 10:58:53 image-name kernel: [ 27.735867] aufs au_opts_verify:1570:docker[2414]: dirperm1 breaks the protection by the permission bits on the lower branch
Nov 18 10:58:53 image-name kernel: [ 27.771732] aufs au_opts_verify:1570:docker[2260]: dirperm1 breaks the protection by the permission bits on the lower branch
Nov 18 10:58:53 image-name kernel: [ 27.797540] device vethfa3ab85 entered promiscuous mode
Nov 18 10:58:53 image-name kernel: [ 27.804420] IPv6: ADDRCONF(NETDEV_UP): vethfa3ab85: link is not ready
Nov 18 10:58:53 image-name kernel: [ 28.028306] IPv6: ADDRCONF(NETDEV_CHANGE): vethfa3ab85: link becomes ready
Nov 18 10:58:53 image-name kernel: [ 28.035505] docker0: port 1(vethfa3ab85) entered forwarding state
Nov 18 10:58:53 image-name kernel: [ 28.041963] docker0: port 1(vethfa3ab85) entered forwarding state
Nov 18 10:58:53 image-name kernel: [ 28.048532] IPv6: ADDRCONF(NETDEV_CHANGE): docker0: link becomes ready
Nov 18 10:58:54 image-name kernel: [ 28.980082] IPv6: eth0: IPv6 duplicate address fe80::42:acff:fe11:1 detected!
->>> about here I can SSH to the server
Nov 18 10:59:08 image-name kernel: [ 43.068094] docker0: port 1(vethfa3ab85) entered forwarding state
Nov 18 10:59:53 image-name kernel: [ 87.944452] aufs au_opts_verify:1570:docker[2864]: dirperm1 breaks the protection by the permission bits on the lower branch
Nov 18 10:59:53 image-name kernel: [ 88.001012] aufs au_opts_verify:1570:docker[2864]: dirperm1 breaks the protection by the permission bits on the lower branch
Nov 18 10:59:53 image-name kernel: [ 88.049510] aufs au_opts_verify:1570:docker[2815]: dirperm1 breaks the protection by the permission bits on the lower branch
->>> I want to know about this point in the startup process
My problem is that I can connect to it using SSH when kernel progress is below 30% and some processes are not yet started. I want to detect somehow if the server has completed startup. Or is there a script that can push to the server (through the GCE APIs) to notify me when a server is completely up?
gcloud compute instances describe image-name does return the same output from the moment the instance is started till the kernel startup is complete.
(In my case I use the Node.js GCE API, but this should not make any difference.)
Presently I am not aware of any such google native API that can provide a progress of instance start.
However this is a quick workaround check if this fits your requirement.
You can either use the Google Startup script or the native linux rc.local. The concept is the same, so explaining it for the case of rc.local [as it is generic and not tied to google]
We know that the last process in a bootup sequence that runs is rc.local. Any command or script or call that is in this rc.local [which is a sh or bash script by itself] will be executed at the end of boot process.
So the idea would be in the google image in case of rc.local, have a script or a call which send your a notification or writes a output to central system like KV or cloud storage the state that bootup is all done.
Similar to Kamran, but here is how I get this done. It depends on using a google startup script and an image where gcloud is installed by default (though you could rework this to just use curl and API calls)
On instance creation/configuration, I set a custom metadata flag: serverready=False
At the end of my google startup script, I have this:
sudo gcloud compute instances add-metadata $(hostname) \
--metadata serverready=True \
--zone $(curl \
"http://metadata.google.internal/computeMetadata/v1/instance/zone" \
-H "Metadata-Flavor: Google"|cut -d/ -f4)
When I run the instance creation, I can just poll the metadata for the serverready key, and set my app to wait until it sees serverready=True

How to detect Openwrt kern.info and daemon.info events?

My background is mostly Windows programming in C and C++. Recently I've had the chance to work with some embedded Linux systems also, but I'm still new at this.
Right now I'm working on a utility for Openwrt that needs to react to network and system events that occur during normal operation.
I've been able to use Hotplug for some events, but others still elude me. I can parse the output of the system log using logread, but that seems primitive and hackish.
In particular I'd like to get a callback similar to what hotplug does for some of the 'kern.info kernel' and 'daemon.info' events. For example:
Mar 31 19:42:32 OpenWrt kern.info kernel: [ 369.540000] device wlan0 left promiscuous mode
Mar 31 19:42:32 OpenWrt kern.info kernel: [ 369.540000] br-lan: port 2(wlan0) entered disabled state
Mar 31 19:42:32 OpenWrt kern.info kernel: [ 369.730000] device wlan1 left promiscuous mode
Mar 31 19:42:32 OpenWrt kern.info kernel: [ 369.730000] br-lan: port 3(wlan1) entered disabled state
Mar 31 19:42:34 OpenWrt kern.info kernel: [ 371.360000] device wlan0 entered promiscuous mode
Mar 31 19:45:56 OpenWrt daemon.info hostapd: wlan0: STA 04:f7:e4:00:00:00 IEEE 802.11: authenticated
Mar 31 19:45:56 OpenWrt daemon.info hostapd: wlan0: STA 04:f7:e4:00:00:00 IEEE 802.11: associated (aid 1)
Mar 31 19:45:56 OpenWrt daemon.info hostapd: wlan0: STA 04:f7:e4:00:00:00 WPA: pairwise key handshake completed (WPA)
Mar 31 19:45:56 OpenWrt daemon.info hostapd: wlan0: STA 04:f7:e4:00:00:00 WPA: group key handshake completed (WPA)
Mar 31 19:45:56 OpenWrt daemon.info dnsmasq-dhcp[5005]: DHCPREQUEST(br-lan) 10.1.1.51 04:f7:e4:00:00:00
Mar 31 19:45:56 OpenWrt daemon.info dnsmasq-dhcp[5005]: DHCPNAK(br-lan) 10.1.1.51 04:f7:e4:00:00:00 wrong network
Mar 31 19:46:00 OpenWrt daemon.info dnsmasq-dhcp[5005]: DHCPDISCOVER(br-lan) 04:f7:e4:00:00:00
Mar 31 19:46:00 OpenWrt daemon.info dnsmasq-dhcp[5005]: DHCPOFFER(br-lan) 192.168.1.198 04:f7:e4:1c:09:00
Mar 31 19:46:00 OpenWrt daemon.info dnsmasq-dhcp[5005]: DHCPDISCOVER(br-lan) 04:f7:e4:1c:09:00
Mar 31 19:46:00 OpenWrt daemon.info dnsmasq-dhcp[5005]: DHCPOFFER(br-lan) 192.168.1.198 04:f7:e4:1c:09:00
Mar 31 19:46:01 OpenWrt daemon.info dnsmasq-dhcp[5005]: DHCPREQUEST(br-lan) 192.168.1.198 04:f7:e4:1c:09:00
Mar 31 19:46:01 OpenWrt daemon.info dnsmasq-dhcp[5005]: DHCPACK(br-lan) 192.168.1.198 04:f7:e4:1c:09:00 My-iPhone
Log entries like the DHCPOFFER (as seen in your example) are generated individually by the corresponding process (for example, by udhcpc) using the Unix syslog mechanism (kind-of like the Windows Event Logging API)
By default on OpenWRT logging is handled by the syslogd process provided by the busybox package. This is fairly primitive and simply sends messages to the circular buffer you see using logread and/or to a UDP socket.
You can upgrade logging on OpenWRT to use the syslog-ng package. This has a much more advanced configuration and you should be able to use this to send filtered log events to a script that you can write to do what you need with them.
opkg install syslog-ng
syslog-ng is a GPL product but the documentation is now buried beneath a commercial web site, one would hope you can get it from the source code , via http://freecode.com/projects/syslog-ng. Note that OpenWRT seems to provide version 1.6.12 which I had trouble finding the documentation for when I implemented it on my OpenWRT devices, but eventually I found it via the wayback machine: https://web.archive.org/web/20070406054439/http://www.balabit.com/products/syslog_ng/reference-1.6/syslog-ng.html/x731.html (for example)
A configuration file fragment that would pull out those DHCP messages and send them to a standalone log file would look a bit like:
source src { unix-stream("/dev/log"); internal(); };
destination dhcp_messages { file("/var/log/dhcpmessages"); };
filter f_dhcp { match("dnsmasq-dhcp"); };
log {
source(src);
filter(f_dhcp);
destination(dhcp_messages);
};
You might probably find the pipe() or program() destination drivers the most useful for your application. For example, using a program() driver you could send selected messages to a shell script that parses them and saves them into a sqlite database.

Resources