Configuration Error in Azure IoT Edge installation - "configuration has correct URIs for daemon mgmt endpoint - Error" - azure

Version details
OS: Ubuntu 18.04.5 LTS
aziot-edge: bionic,now 1.2.3-1 amd64
aziot-identity-service: bionic,now 1.2.2-1 amd64
docker: Docker version 20.10.8+azure, build 3967b7d28e15a020e4ee344283128ead633b3e0c
Verifying the installation shows that the aziot-identityd is in "Down-activating" state
# sudo iotedge system status
System services:
aziot-edged Running
aziot-identityd Down - activating
aziot-keyd Running
aziot-certd Running
aziot-tpmd Ready
aziot-identityd is in a bad state because:
aziot-identityd.service: Down - activating : Printing the last 10 log lines.
-- Logs begin at Fri 2020-11-06 12:29:56 IST, end at Fri 2021-09-10 19:07:13 IST. --
Sep 10 19:07:10 vm-DevIoTEdge1-poc-CentIN aziot-identityd[1871]: 2021-09-10T13:37:10Z [INFO] - Could not reconcile Identities with current device data. Reprovisioning.
Sep 10 19:07:10 vm-DevIoTEdge1-poc-CentIN aziot-identityd[1871]: 2021-09-10T13:37:10Z [INFO] - Updated device info for Edge1.
Sep 10 19:07:10 vm-DevIoTEdge1-poc-CentIN aziot-identityd[1871]: 2021-09-10T13:37:10Z [ERR!] - Failed to provision with IoT Hub, and no valid device backup was found: Hub client error
Sep 10 19:07:10 vm-DevIoTEdge1-poc-CentIN aziot-identityd[1871]: 2021-09-10T13:37:10Z [ERR!] - service encountered an error
Sep 10 19:07:10 vm-DevIoTEdge1-poc-CentIN aziot-identityd[1871]: 2021-09-10T13:37:10Z [ERR!] - caused by: Hub client error
Sep 10 19:07:10 vm-DevIoTEdge1-poc-CentIN aziot-identityd[1871]: 2021-09-10T13:37:10Z [ERR!] - caused by: internal error
Sep 10 19:07:10 vm-DevIoTEdge1-poc-CentIN aziot-identityd[1871]: 2021-09-10T13:37:10Z [ERR!] - 0: <unknown>
Sep 10 19:07:10 vm-DevIoTEdge1-poc-CentIN aziot-identityd[1871]: 1: <unknown>
Sep 10 19:07:10 vm-DevIoTEdge1-poc-CentIN systemd[1]: aziot-identityd.service: Main process exited, code=exited, status=1/FAILURE
Sep 10 19:07:10 vm-DevIoTEdge1-poc-CentIN systemd[1]: aziot-identityd.service: Failed with result 'exit-code'.
iotedge check shows 2 configuration related errors:
# iotedge check --verbose
Configuration checks (aziot-identity-service)
---------------------------------------------
√ keyd configuration is well-formed - OK
√ certd configuration is well-formed - OK
√ tpmd configuration is well-formed - OK
√ identityd configuration is well-formed - OK
√ daemon configurations up-to-date with config.toml - OK
√ identityd config toml file specifies a valid hostname - OK
√ aziot-identity-service package is up-to-date - OK
√ host time is close to reference time - OK
√ preloaded certificates are valid - OK
√ keyd is running - OK
√ certd is running - OK
√ identityd is running - OK
× read all preloaded certificates from the Certificates Service - Error
could not load cert with ID "aziot-edged-trust-bundle"
Caused by:
parameter "id" has an invalid value
caused by: not found
√ read all preloaded key pairs from the Keys Service - OK
√ ensure all preloaded certificates match preloaded private keys with the same ID - OK
Connectivity checks (aziot-identity-service)
--------------------------------------------
√ host can connect to and perform TLS handshake with iothub AMQP port - OK
√ host can connect to and perform TLS handshake with iothub HTTPS / WebSockets port - OK
√ host can connect to and perform TLS handshake with iothub MQTT port - OK
Configuration checks
--------------------
√ aziot-edged configuration is well-formed - OK
√ configuration up-to-date with config.toml - OK
√ container engine is installed and functional - OK
× configuration has correct URIs for daemon mgmt endpoint - Error
SocketError - SocketErrorCode (TimedOut) : Operation timed out
One or more errors occurred. (Got bad response: )
caused by: docker returned exit code: 1, stderr = SocketError - SocketErrorCode (TimedOut) : Operation timed out
One or more errors occurred. (Got bad response: )
√ aziot-edge package is up-to-date - OK
√ container time is close to host time - OK
‼ DNS server - Warning
Container engine is not configured with DNS server setting, which may impact connectivity to IoT Hub.
Please see https://aka.ms/iotedge-prod-checklist-dns for best practices.
You can ignore this warning if you are setting DNS server per module in the Edge deployment.
caused by: Could not open container engine config file /etc/docker/daemon.json
caused by: No such file or directory (os error 2)
√ production readiness: container engine - OK
‼ production readiness: logs policy - Warning
Container engine is not configured to rotate module logs which may cause it run out of disk space.
Please see https://aka.ms/iotedge-prod-checklist-logs for best practices.
You can ignore this warning if you are setting log policy per module in the Edge deployment.
caused by: Could not open container engine config file /etc/docker/daemon.json
caused by: No such file or directory (os error 2)
× production readiness: Edge Agent's storage directory is persisted on the host filesystem - Error
Could not check current state of edgeAgent container
caused by: docker returned exit code: 1, stderr = Error: No such object: edgeAgent
× production readiness: Edge Hub's storage directory is persisted on the host filesystem - Error
Could not check current state of edgeHub container
caused by: docker returned exit code: 1, stderr = Error: No such object: edgeHub
√ Agent image is valid and can be pulled from upstream - OK
Connectivity checks
-------------------
√ container on the default network can connect to upstream AMQP port - OK
√ container on the default network can connect to upstream HTTPS / WebSockets port - OK
√ container on the default network can connect to upstream MQTT port - OK
√ container on the IoT Edge module network can connect to upstream AMQP port - OK
√ container on the IoT Edge module network can connect to upstream HTTPS / WebSockets port - OK
√ container on the IoT Edge module network can connect to upstream MQTT port - OK
30 check(s) succeeded.
2 check(s) raised warnings.
4 check(s) raised errors.
TOML file has only the manual provisioning with connection string.

I had this error because my IOT Hub networks "Public network access" was set as "Disabled".
You can correct this by going the following:
Go to the Azure portal, and go to the IOT Hub resource in question.
Go to the Networking menu option.
Change the "Public network access" to either "All Networks" or "Selected IP ranges", depending on your use case. Remember if you select "Selected IP ranges", you must add the VM/IOT devices ip address to the list of allowed IP addresses.

I came across this question like too many times while I was working with an enterprise environment. My finding is more related to the environment and security aspect of the whole system.
For my case, my working environment was RedHat Linux and Azure is hosted on-prem with added layer of proxy server. Only one piece of advice to solve most common issues in such environment is to give all necessary permissions of rwx (read, write, all).
Pinpointing the problem asked, the identity daemon is failing because the aziot trust bundle is not loading properly.
read all preloaded certificates from the Certificates Service - Error
could not load cert with ID "aziot-edged-trust-bundle"
Check the certificate is properly setup to use device identity certificate.
Second error is related to daemon management socket:
× configuration has correct URIs for daemon mgmt endpoint - Error
SocketError - SocketErrorCode (TimedOut) : Operation timed out
One or more errors occurred. (Got bad response: )
caused by: docker returned exit code: 1, stderr = SocketError - SocketErrorCode (TimedOut) : Operation timed out
One or more errors occurred. (Got bad response: )
This can be resolved by manually giving ownership permission to mgmt.sock at /var/lib/iotedge location.
Nevertheless, there may be a variety of reasons for iotedge dps to not work and further iotAgent and iotHub to not start. It is better to go to the root of the issue and start resolving it.

Related

Minikube not staring in linux machine

Initially minikube used to run on the same machine.
Now when I am starting the minikube start command it is not starting tried everything but still the same.
Here are the logs of minikube start
[Company\sainath.reddy#hostm repos]$ minikube start
😄 minikube v1.26.0 on Amazon 2 (xen/amd64)
✨ Using the docker driver based on existing profile
👍 Starting control plane node minikube in cluster minikube
🚜 Pulling base image ...
🏃 Updating the running docker "minikube" container ...
❗ This container is having trouble accessing https://k8s.gcr.io
💡 To pull new external images, you may need to configure a proxy: https://minikube.sigs.k8s.io/docs/reference/networking/proxy/
🐳 Preparing Kubernetes v1.24.1 on Docker 20.10.17 ...
▪ kubelet.cgroup-driver=systemd
🤦 Unable to restart cluster, will reset it: apiserver healthz: apiserver process never appeared
▪ Generating certificates and keys ...
▪ Booting up control plane ...
💢 initialization failed, will try again: wait: /bin/bash -c "sudo env PATH="/var/lib/minikube/binaries/v1.24.1:$PATH" kubeadm init --config /var/tmp/minikube/kubeadm.yaml --ignore-preflight-errors=DirAvailable--etc-kubernetes-manifests,DirAvailable--var-lib-minikube,DirAvailable--var-lib-minikube-etcd,FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml,FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml,FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml,FileAvailable--etc-kubernetes-manifests-etcd.yaml,Port-10250,Swap,Mem,SystemVerification,FileContent--proc-sys-net-bridge-bridge-nf-call-iptables": Process exited with status 1
stdout:
[init] Using Kubernetes version: v1.24.1
[preflight] Running pre-flight checks
[preflight] The system verification failed. Printing the output from the verification:
KERNEL_VERSION: 4.14.285-215.501.amzn2.x86_64
OS: Linux
CGROUPS_CPU: enabled
CGROUPS_CPUACCT: enabled
CGROUPS_CPUSET: enabled
CGROUPS_DEVICES: enabled
CGROUPS_FREEZER: enabled
CGROUPS_MEMORY: enabled
CGROUPS_PIDS: enabled
CGROUPS_HUGETLB: enabled
CGROUPS_BLKIO: enabled
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/var/lib/minikube/certs"
[certs] Using existing ca certificate authority
[certs] Using existing apiserver certificate and key on disk
[certs] Using existing apiserver-kubelet-client certificate and key on disk
[certs] Using existing front-proxy-ca certificate authority
[certs] Using existing front-proxy-client certificate and key on disk
[certs] Using existing etcd/ca certificate authority
[certs] Using existing etcd/server certificate and key on disk
[certs] Using existing etcd/peer certificate and key on disk
[certs] Using existing etcd/healthcheck-client certificate and key on disk
[certs] Using existing apiserver-etcd-client certificate and key on disk
[certs] Using the existing "sa" key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
- 'crictl --runtime-endpoint unix:///var/run/cri-dockerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint unix:///var/run/cri-dockerd.sock logs CONTAINERID'
stderr:
W0723 18:18:12.089975 49249 initconfiguration.go:120] Usage of CRI endpoints without URL scheme is deprecated and can cause kubelet errors in the future. Automatically prepending scheme "unix" to the "criSocket" with value "/var/run/cri-dockerd.sock". Please update your configuration!
[WARNING FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables does not exist
[WARNING Swap]: swap is enabled; production deployments should disable swap unless testing the NodeSwap feature gate of the kubelet
[WARNING SystemVerification]: failed to parse kernel config: unable to load kernel module: "configs", output: "modprobe: FATAL: Module configs not found in directory /lib/modules/4.14.285-215.501.amzn2.x86_64\n", err: exit status 1
[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
▪ Generating certificates and keys ...
▪ Booting up control plane ...
💣 Error starting cluster: wait: /bin/bash -c "sudo env PATH="/var/lib/minikube/binaries/v1.24.1:$PATH" kubeadm init --config /var/tmp/minikube/kubeadm.yaml --ignore-preflight-errors=DirAvailable--etc-kubernetes-manifests,DirAvailable--var-lib-minikube,DirAvailable--var-lib-minikube-etcd,FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml,FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml,FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml,FileAvailable--etc-kubernetes-manifests-etcd.yaml,Port-10250,Swap,Mem,SystemVerification,FileContent--proc-sys-net-bridge-bridge-nf-call-iptables": Process exited with status 1
stdout:
[init] Using Kubernetes version: v1.24.1
[preflight] Running pre-flight checks
[preflight] The system verification failed. Printing the output from the verification:
KERNEL_VERSION: 4.14.285-215.501.amzn2.x86_64
OS: Linux
CGROUPS_CPU: enabled
CGROUPS_CPUACCT: enabled
CGROUPS_CPUSET: enabled
CGROUPS_DEVICES: enabled
CGROUPS_FREEZER: enabled
CGROUPS_MEMORY: enabled
CGROUPS_PIDS: enabled
CGROUPS_HUGETLB: enabled
CGROUPS_BLKIO: enabled
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/var/lib/minikube/certs"
[certs] Using existing ca certificate authority
[certs] Using existing apiserver certificate and key on disk
[certs] Using existing apiserver-kubelet-client certificate and key on disk
[certs] Using existing front-proxy-ca certificate authority
[certs] Using existing front-proxy-client certificate and key on disk
[certs] Using existing etcd/ca certificate authority
[certs] Using existing etcd/server certificate and key on disk
[certs] Using existing etcd/peer certificate and key on disk
[certs] Using existing etcd/healthcheck-client certificate and key on disk
[certs] Using existing apiserver-etcd-client certificate and key on disk
[certs] Using the existing "sa" key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
- 'crictl --runtime-endpoint unix:///var/run/cri-dockerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint unix:///var/run/cri-dockerd.sock logs CONTAINERID'
stderr:
W0723 18:22:16.760474 50480 initconfiguration.go:120] Usage of CRI endpoints without URL scheme is deprecated and can cause kubelet errors in the future. Automatically prepending scheme "unix" to the "criSocket" with value "/var/run/cri-dockerd.sock". Please update your configuration!
[WARNING FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables does not exist
[WARNING Swap]: swap is enabled; production deployments should disable swap unless testing the NodeSwap feature gate of the kubelet
[WARNING SystemVerification]: failed to parse kernel config: unable to load kernel module: "configs", output: "modprobe: FATAL: Module configs not found in directory /lib/modules/4.14.285-215.501.amzn2.x86_64\n", err: exit status 1
[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
╭───────────────────────────────────────────────────────────────────────────────────────────╮
│ │
│ 😿 If the above advice does not help, please let us know: │
│ 👉 https://github.com/kubernetes/minikube/issues/new/choose │
│ │
│ Please run `minikube logs --file=logs.txt` and attach logs.txt to the GitHub issue. │
│ │
╰───────────────────────────────────────────────────────────────────────────────────────────╯
❌ Exiting due to K8S_KUBELET_NOT_RUNNING: wait: /bin/bash -c "sudo env PATH="/var/lib/minikube/binaries/v1.24.1:$PATH" kubeadm init --config /var/tmp/minikube/kubeadm.yaml --ignore-preflight-errors=DirAvailable--etc-kubernetes-manifests,DirAvailable--var-lib-minikube,DirAvailable--var-lib-minikube-etcd,FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml,FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml,FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml,FileAvailable--etc-kubernetes-manifests-etcd.yaml,Port-10250,Swap,Mem,SystemVerification,FileContent--proc-sys-net-bridge-bridge-nf-call-iptables": Process exited with status 1
stdout:
[init] Using Kubernetes version: v1.24.1
[preflight] Running pre-flight checks
[preflight] The system verification failed. Printing the output from the verification:
KERNEL_VERSION: 4.14.285-215.501.amzn2.x86_64
OS: Linux
CGROUPS_CPU: enabled
CGROUPS_CPUACCT: enabled
CGROUPS_CPUSET: enabled
CGROUPS_DEVICES: enabled
CGROUPS_FREEZER: enabled
CGROUPS_MEMORY: enabled
CGROUPS_PIDS: enabled
CGROUPS_HUGETLB: enabled
CGROUPS_BLKIO: enabled
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/var/lib/minikube/certs"
[certs] Using existing ca certificate authority
[certs] Using existing apiserver certificate and key on disk
[certs] Using existing apiserver-kubelet-client certificate and key on disk
[certs] Using existing front-proxy-ca certificate authority
[certs] Using existing front-proxy-client certificate and key on disk
[certs] Using existing etcd/ca certificate authority
[certs] Using existing etcd/server certificate and key on disk
[certs] Using existing etcd/peer certificate and key on disk
[certs] Using existing etcd/healthcheck-client certificate and key on disk
[certs] Using existing apiserver-etcd-client certificate and key on disk
[certs] Using the existing "sa" key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
- 'crictl --runtime-endpoint unix:///var/run/cri-dockerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint unix:///var/run/cri-dockerd.sock logs CONTAINERID'
stderr:
W0723 18:22:16.760474 50480 initconfiguration.go:120] Usage of CRI endpoints without URL scheme is deprecated and can cause kubelet errors in the future. Automatically prepending scheme "unix" to the "criSocket" with value "/var/run/cri-dockerd.sock". Please update your configuration!
[WARNING FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables does not exist
[WARNING Swap]: swap is enabled; production deployments should disable swap unless testing the NodeSwap feature gate of the kubelet
[WARNING SystemVerification]: failed to parse kernel config: unable to load kernel module: "configs", output: "modprobe: FATAL: Module configs not found in directory /lib/modules/4.14.285-215.501.amzn2.x86_64\n", err: exit status 1
[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
💡 Suggestion: Check output of 'journalctl -xeu kubelet', try passing --extra-config=kubelet.cgroup-driver=systemd to minikube start
🍿 Related issue: https://github.com/kubernetes/minikube/issues/4172
[Company\sainath.reddy#host repos]$
I have tried minikube start --extra-config=kubelet.cgroup-driver=systemd but still the same.
I am running the minikube on linux machine which is amazon linux 2
Here is the logs when I check kubelet service.
[Company\sainath.reddy#host ~]$ systemctl status kubelet.service
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /usr/lib/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: activating (auto-restart) (Result: exit-code) since Mon 2022-07-25 12:28:51 IST; 2s ago
Docs: https://kubernetes.io/docs/
Process: 744434 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=1/FAILURE)
Main PID: 744434 (code=exited, status=1/FAILURE)
[Company\sainath.reddy#host ~]$
Error message is, (you can search for this error keyword K8S_KUBELET_NOT_RUNNING):
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
Kubelet has to be up for running. Check it with these:
systemctl status kubelet
journalctl -xeu kubelet
Steps:
Is systemd installed on your machine?
check it with this:
rpm -qa | grep -i systemd
else install with,
yum install -y /usr/bin/systemctl; systemctl --version
Try updating the cgroup driver in a /etc/docker/daemon.json file (create if it is not there!):
{
"exec-opts": ["native.cgroupdriver=systemd"]
}
Restart the services: (use sudo where needed)
systemctl daemon-reload
systemctl restart docker.service
systemctl restart kubelet
After this if kubelet is running, then issue resolved.
And after this if you face issues with swap, you can swap off with instructions here.

Security handshake failed: {"description":"Handshake read failed"}

What version of gRPC and what language are you using?
#grpc/grpc-js - 1.5.10
What operating system (Linux, Windows,...) and version?
server running in a docker container on azure cloud
What did you do?
I have created a grpc server with SSL. It is a test server, where I use self signed certificates for server. The connection between server and client works fine. But I enabled the debug and trace (tcp, http) logs on the server. I keep getting handshake failed error.
I0427 12:07:40.319067700 18 tcp_server_custom.cc:224] SERVER_CONNECT: 0x7f06409cf3a0 accepted connection: ipv4:10.92.0.9:52824
I0427 12:07:40.319239300 18 tcp_custom.cc:353] Creating TCP endpoint 0x7f0640c78430
I0427 12:07:40.319432800 18 tcp_custom.cc:174] TCP:0x7f0640c78430 read_allocation_done: "No Error"
I0427 12:07:40.319503900 18 tcp_custom.cc:191] Initiating read on 0x7f0640c78430: error="No Error"
I0427 12:07:40.331081600 18 tcp_custom.cc:127] TCP:0x7f0640afea60 call_cb 0x7f0641ed57e0 0x7f0640848b90:0x7f0641ed5610
I0427 12:07:40.331206000 18 tcp_custom.cc:131] read: error={"created":"#1651061260.331064200","description":"EOF","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":106}
D0427 12:07:40.331327300 18 security_handshaker.cc:176] Security handshake failed: {"created":"#1651061260.331311100","description":"Handshake read failed","file":"../deps/grpc/src/core/lib/security/transport/security_handshaker.cc","file_line":357,"referenced_errors":[{"created":"#1651061260.331064200","description":"EOF","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":106}]}
I0427 12:07:40.331412400 18 tcp_custom.cc:287] TCP 0x7f0640afea60 shutdown why={"created":"#1651061260.331311100","description":"Handshake read failed","file":"../deps/grpc/src/core/lib/security/transport/security_handshaker.cc","file_line":357,"referenced_errors":[{"created":"#1651061260.331064200","description":"EOF","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":106}]}
D0427 12:07:40.331443800 18 chttp2_server.cc:122] Handshaking failed: {"created":"#1651061260.331311100","description":"Handshake read failed","file":"../deps/grpc/src/core/lib/security/transport/security_handshaker.cc","file_line":357,"referenced_errors":[{"created":"#1651061260.331064200","description":"EOF","file":"../deps/grpc/src/core/lib/iomgr/tcp_uv.cc","file_line":106}]}
### Anything else we should know about your project / environment?
I have an envoy proxy also running for the grpc server to make grpc-web requests.
Node version: node:14-alpine

Unable to push image to OpenShift internal registry with i/o timeout

Pushing image docker-registry.default.svc:5000/th/th:source ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Registry server Address:
Registry server User Name: serviceaccount
Registry server Email: serviceaccount#example.org
Registry server Password: <<non-empty>>
error: build error: Failed to push image: After retrying 6 times, Push image still failed due to error: Get https://docker-registry.default.svc:5000/v1/_ping: dial TCP<ip>:5000: i/o timeout
Manually pushing an image from the CLI to the internal registry is working fine.
I have deployed the OpenShift instance 3.11 on a couple of azure VMs, while deploying I took care of adding external IP to the same.
All other images are also present in the docker registry and the curl command to the docker registry returns with exit code 0
What seemed curious was while deploying my app I tried pinging the registry from the build pods terminal. This resulted in the connection being hung up and no response.
Any ideas on how to fix this?
The sdn was causing this networking issue.
Does Azure support Calico networking?
Calico in VXLAN mode is supported on Azure. However, IPIP packets are
blocked by the Azure network fabric.
The above quote from calico reference was the reason this issue was caused. This could be resolved by changing to VXLAN mode in calico config. More details on how to switch can be found here.
For my solution I just switched to the default openshift sdn 'ovs-subnet' from calico in the inventory file.

One or more errors occurred. (Permission denied /var/run/iotedge/mgmt.sock) caused by: docker returned exit code:

I just installed IoTEdge on Raspberry strech following this:Azure/InternetofThings/IoTEdge/Install or uninstall the Azure IoT Edge runtime
However I get these errors below.
3 weeks I installed another one and it worked perfectly with the same instructions.
pi#raspberrypi:/etc/iotedge $ sudo iotedge check --verbose
Configuration checks
--------------------
√ config.yaml is well-formed - OK
√ config.yaml has well-formed connection string - OK
√ container engine is installed and functional - OK
√ config.yaml has correct hostname - OK
× config.yaml has correct URIs for daemon mgmt endpoint - Error
One or more errors occurred. (Permission denied /var/run/iotedge/mgmt.sock)
caused by: docker returned exit code: 1, stderr = One or more errors occurred. (Permission denied /var/run/iotedge/mgmt.sock)
√ latest security daemon - OK
√ host time is close to real time - OK
√ container time is close to host time - OK
‼ DNS server - Warning
Container engine is not configured with DNS server setting, which may impact connectivity to IoT Hub.
Please see https://aka.ms/iotedge-prod-checklist-dns for best practices.
You can ignore this warning if you are setting DNS server per module in the Edge deployment.
caused by: Could not open container engine config file /etc/docker/daemon.json
caused by: No such file or directory (os error 2)
‼ production readiness: certificates - Warning
The Edge device is using self-signed automatically-generated development certificates.
They will expire in 89 days (at 2021-02-22 07:24:52 UTC) causing module-to-module and downstream device communication to fail on an active deployment.
After the certs have expired, restarting the IoT Edge daemon will trigger it to generate new development certs.
Please consider using production certificates instead. See https://aka.ms/iotedge-prod-checklist-certs for best practices.
√ production readiness: container engine - OK
‼ production readiness: logs policy - Warning
Container engine is not configured to rotate module logs which may cause it run out of disk space.
Please see https://aka.ms/iotedge-prod-checklist-logs for best practices.
You can ignore this warning if you are setting log policy per module in the Edge deployment.
caused by: Could not open container engine config file /etc/docker/daemon.json
caused by: No such file or directory (os error 2)
‼ production readiness: Edge Agent's storage directory is persisted on the host filesystem - Warning
The edgeAgent module is not configured to persist its /tmp/edgeAgent directory on the host filesystem.
Data might be lost if the module is deleted or updated.
Please see https://aka.ms/iotedge-storage-host for best practices.
× production readiness: Edge Hub's storage directory is persisted on the host filesystem - Error
Could not check current state of edgeHub container
caused by: docker returned exit code: 1, stderr = Error: No such object: edgeHub
Connectivity checks
-------------------
√ host can connect to and perform TLS handshake with IoT Hub AMQP port - OK
√ host can connect to and perform TLS handshake with IoT Hub HTTPS / WebSockets port - OK
√ host can connect to and perform TLS handshake with IoT Hub MQTT port - OK
√ container on the default network can connect to IoT Hub AMQP port - OK
√ container on the default network can connect to IoT Hub HTTPS / WebSockets port - OK
√ container on the default network can connect to IoT Hub MQTT port - OK
√ container on the IoT Edge module network can connect to IoT Hub AMQP port - OK
√ container on the IoT Edge module network can connect to IoT Hub HTTPS / WebSockets port - OK
√ container on the IoT Edge module network can connect to IoT Hub MQTT port - OK
17 check(s) succeeded.
4 check(s) raised warnings.
2 check(s) raised errors.
iotedge list
pi#raspberrypi:~ $ sudo iotedge list
A module runtime error occurred
caused by: Could not list modules
caused by: connection error: Connection reset by peer (os error 104)

How to connect Mist to the private blockchain on remote server (Azure)?

I've installed Mist on my local PC (Windows 10), but I don't want to sync Main/Test networks. So I've used this Ethereum + Azure tutorial and now I can work via SSH on my private network.
geth --dev console
More than that, I know that it's possible to run Mist on custom blockchain using special flag
mist.exe --rpc http://YOUR_IP:PORT
So, according to geth --help, I'm running geth --dev --rpc console on Azure's virtual machine, after that I'm running mist.exe --rpc http://VM_IP:8545 and there is an error:
[2016-09-24 18:01:21.928] [INFO] Sockets/node-ipc - Connect to {"hostPort":"http://VM_IP:8545"}
[2016-09-24 18:01:24.968] [ERROR] Sockets/node-ipc - Connection failed (3000ms elapsed)
[2016-09-24 18:01:24.971] [WARN] EthereumNode - Failed to connect to node. Maybe it's not running so let's start our own...
[2016-09-24 18:01:24.979] [INFO] EthereumNode - Node type: geth
[2016-09-24 18:01:24.982] [INFO] EthereumNode - Network: test
[2016-09-24 18:01:24.983] [INFO] EthereumNode - Start node: geth test
[2016-09-24 18:01:32.284] [INFO] EthereumNode - 3000ms elapsed, assuming node started up successfully
[2016-09-24 18:01:32.286] [INFO] EthereumNode - Started node successfully: geth test
[2016-09-24 18:01:32.327] [INFO] Sockets/node-ipc - Connect to {"hostPort":"http://VM_IP:8545"}
[2016-09-24 18:02:02.332] [ERROR] Sockets/node-ipc - Connection failed (30000ms elapsed)
[2016-09-24 18:02:02.333] [ERROR] EthereumNode - Failed to connect to node Error: Unable to connect to socket: timeout
P.S. Mist version - 0.8.2
Your approach is correct. I would say that you have a network configuration issue that prevents your Mist to talk to geth.
I would suggest doing the following test and see if you run into the same issue:
- on the machine where you have Mist, find the geth.exe executable
- run geth with geth --testnet --rpc
- start mist with ./Mist --rpc /.../Ethereum/testnet/geth.ipc or ./Mist --rpc http://localhost:8545
I am on a Mac so I guess you will have to reverse the / and add some C: decorations here and there.

Resources