Kubernetes on Raspberry Pi: kube-flannel CrashLoopBackOff and kube-dns rpc error code = 2 - linux

I used this tutorial to set up a Kubernetes cluster on my Raspberry Pi 3.
I followed the instructions up to the point of setting up flannel with:
curl -sSL https://rawgit.com/coreos/flannel/v0.7.0/Documentation/kube-flannel.yml | sed "s/amd64/arm/g" | kubectl create -f -
Now kubectl get po --all-namespaces shows the following errors:
NAMESPACE     NAME                             READY   STATUS             RESTARTS   AGE
kube-system   etcd-node01                      1/1     Running            0          34m
kube-system   kube-apiserver-node01            1/1     Running            0          34m
kube-system   kube-controller-manager-node01   1/1     Running            0          34m
kube-system   kube-dns-279829092-x4dc4         0/3     rpc error: code = 2 desc = failed to start container "de9b2094dbada10a0b44df97d25bb629d6fbc96b8ddc0c060bed1d691a308b37": Error response from daemon: {"message":"cannot join network of a non running container: af8e15c6ad67a231b3637c66fab5d835a150da7385fc403efc0a32b8fb7aa165"}   15   39m
kube-system   kube-flannel-ds-zk17g            1/2     CrashLoopBackOff   11         35m
kube-system   kube-proxy-6zwtb                 1/1     Running            0          37m
kube-system   kube-proxy-wbmz2                 1/1     Running            0          39m
kube-system   kube-scheduler-node01            1/1     Running
Interestingly, I hit the same issue when installing Kubernetes with flannel on my laptop, following another tutorial.
Version details:
Client Version: version.Info{Major:"1", Minor:"6",
GitVersion:"v1.6.3",
GitCommit:"0480917b552be33e2dba47386e51decb1a211df6",
GitTreeState:"clean", BuildDate:"2017-05-10T15:48:59Z",
GoVersion:"go1.8rc2", Compiler:"gc", Platform:"linux/arm"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.3",
GitCommit:"0480917b552be33e2dba47386e51decb1a211df6",
GitTreeState:"clean", BuildDate:"2017-05-10T15:38:08Z",
GoVersion:"go1.8rc2", Compiler:"gc", Platform:"linux/arm"}
Any suggestions that might help?

I solved this issue by creating the cluster roles (RBAC) before setting up the pod network add-on:
curl -sSL https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel-rbac.yml | sed "s/amd64/arm/g" | kubectl create -f -
Then I set up the pod network add-on with:
curl -sSL https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml | sed "s/amd64/arm/g" | kubectl create -f -
Worked for me so far...
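To double-check the fix, you can re-run the pod listing from the question and confirm that the kube-flannel and kube-dns pods eventually report Running (a quick sanity check, not specific to any tutorial):
kubectl get pods --all-namespaces | grep -E 'flannel|dns'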

Related

Error: forwarding ports: error upgrading connection: error dialing backend: - Azure Kubernetes Service

We have upgraded our Kubernetes Service cluster on Azure to the latest version, 1.12.4. After that we suddenly noticed that pods and nodes can no longer communicate with each other via their private IPs:
kubectl get pods -o wide -n kube-system -l component=kube-proxy
NAME READY STATUS RESTARTS AGE IP NODE
kube-proxy-bfhbw 1/1 Running 2 16h 10.0.4.4 aks-agentpool-16086733-1
kube-proxy-d7fj9 1/1 Running 2 16h 10.0.4.35 aks-agentpool-16086733-0
kube-proxy-j24th 1/1 Running 2 16h 10.0.4.97 aks-agentpool-16086733-3
kube-proxy-x7ffx 1/1 Running 2 16h 10.0.4.128 aks-agentpool-16086733-4
As you can see, the node aks-agentpool-16086733-0 has the private IP 10.0.4.35. When we try to check the logs of pods running on this node, we get this error:
Get https://aks-agentpool-16086733-0:10250/containerLogs/emw-sit/nginx-sit-deploy-864b7d7588-bw966/nginx-sit?tailLines=5000&timestamps=true: dial tcp 10.0.4.35:10250: i/o timeout
We have Tiller (Helm) on this node as well, and when we try to connect to Tiller from a client PC we get this error:
shmits-imac:~ andris.shmits01$ helm version
Client: &version.Version{SemVer:"v2.12.3", GitCommit:"eecf22f77df5f65c823aacd2dbd30ae6c65f186e", GitTreeState:"clean"}
Error: forwarding ports: error upgrading connection: error dialing backend: dial tcp 10.0.4.35:10250: i/o timeout
Does anybody have any idea why the pods and nodes lost connectivity via their private IPs?
So, after we scaled the cluster down from 4 nodes to 2, the problem disappeared. And after scaling back up from 2 nodes to 4, everything kept working fine.
The issue could be with the apiserver. Did you check the logs from the apiserver pod?
Can you run the command below inside the cluster? Do you get a 200 OK response?
curl -k -v https://10.96.0.1/version
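One way to run that check from inside the cluster is a throwaway curl pod (a sketch; the image name and flags here are assumptions, and any existing pod with curl available will do just as well):
kubectl run curl-test --rm -it --restart=Never --image=curlimages/curl --command -- curl -k -v https://10.96.0.1/version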
These issues occur when nodes in a Kubernetes cluster created with kubeadm do not get internal IP addresses that match the actual node/machine IPs.
Issue: if I run the helm list command against my cluster, I get the error below:
helm list
Error: forwarding ports: error upgrading connection: unable to upgrade connection: pod does not exist
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k-master Ready master 3h10m v1.18.5 10.0.0.5 <none> Ubuntu 18.04.3 LTS 4.15.0-58-generic docker://19.3.12
k-worker01 Ready <none> 179m v1.18.5 10.0.0.6 <none> Ubuntu 18.04.3 LTS 4.15.0-58-generic docker://19.3.12
k-worker02 Ready <none> 167m v1.18.5 10.0.2.15 <none> Ubuntu 18.04.3 LTS 4.15.0-58-generic docker://19.3.12
Please note: k-worker02 has the internal IP 10.0.2.15, but I was expecting 10.0.0.7, which is the actual node/machine IP.
Solution:
Step 1: Connect to the host (here, k-worker02) whose internal IP does not match the expected one.
Step 2: Open the file below:
sudo vi /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
Step 3: Edit the ExecStart line and append --node-ip 10.0.0.7, so it looks like this:
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS --node-ip 10.0.0.7
Step 4: Reload the daemon and restart the kubelet service
sudo systemctl daemon-reload && sudo systemctl restart kubelet
Result:
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k-master Ready master 3h36m v1.18.5 10.0.0.5 <none> Ubuntu 18.04.3 LTS 4.15.0-58-generic docker://19.3.12
k-worker01 Ready <none> 3h25m v1.18.5 10.0.0.6 <none> Ubuntu 18.04.3 LTS 4.15.0-58-generic docker://19.3.12
k-worker02 Ready <none> 3h13m v1.18.5 10.0.0.7 <none> Ubuntu 18.04.3 LTS 4.15.0-58-generic docker://19.3.12
With the above change, the k-worker02 node got the expected IP (10.0.0.7), and the "forwarding ports:" error no longer comes up when running the helm list or helm install commands.
Reference: https://networkinferno.net/trouble-with-the-kubernetes-node-ip
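Depending on the distribution, the same flag can also be set through the kubelet environment file instead of editing the drop-in unit directly (a sketch, assuming the default kubeadm layout; the path is typically /etc/default/kubelet on Debian/Ubuntu or /etc/sysconfig/kubelet on RHEL-based systems):
# /etc/default/kubelet
KUBELET_EXTRA_ARGS=--node-ip=10.0.0.7
Follow it with the same daemon-reload and kubelet restart as in Step 4.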

Can't do 'helm install' on cluster. Tiller was installed by GitLab

I created a cluster in GKE using GitLab and installed Helm & Tiller and some other things like ingress and GitLab Runner through GitLab's interface. But when I try to install something using helm from gcloud, it gives "Error: Transport is closing".
I did gcloud container clusters get-credentials ....
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default jaeger-deployment-59ffb979c8-lmjk5 1/1 Running 0 17h
gitlab-managed-apps certmanager-cert-manager-6c8cd9f9bf-67wnh 1/1 Running 0 17h
gitlab-managed-apps ingress-nginx-ingress-controller-75c4d99549-x66n4 1/1 Running 0 21h
gitlab-managed-apps ingress-nginx-ingress-default-backend-6f58fb5f56-pvv2f 1/1 Running 0 21h
gitlab-managed-apps prometheus-kube-state-metrics-6584885ccf-hr8fw 1/1 Running 0 22h
gitlab-managed-apps prometheus-prometheus-server-69b9f444df-htxsq 2/2 Running 0 22h
gitlab-managed-apps runner-gitlab-runner-56798d9d9d-nljqn 1/1 Running 0 22h
gitlab-managed-apps tiller-deploy-74f5d65d77-xk6cc 1/1 Running 0 22h
kube-system heapster-v1.6.0-beta.1-7bdb4fd8f9-t8bq9 2/2 Running 0 22h
kube-system kube-dns-7549f99fcc-bhg9t 4/4 Running 0 22h
kube-system kube-dns-autoscaler-67c97c87fb-4vz9t 1/1 Running 0 22h
kube-system kube-proxy-gke-cluster2-pool-1-05abcbc6-0s6j 1/1 Running 0 20h
kube-system kube-proxy-gke-cluster2-pool-2-67e57524-ht5p 1/1 Running 0 22h
kube-system metrics-server-v0.2.1-fd596d746-289nd 2/2 Running 0 22h
visual-react-10450736 production-847c7d879c-z4h5t 1/1 Running 0 22h
visual-react-10450736 production-postgres-64cfcf9464-jr74c 1/1 Running 0 22h
$ ./helm install stable/wordpress --tiller-namespace gitlab-managed-apps --name wordpress
E0127 10:27:29.790366 418 portforward.go:331] an error occurred forwarding 39113 -> 44134: error forwarding port 44134 to pod 86b33bdc7bc30c08d98fe44c0772517c344dd1bdfefa290b46e82bf84959cb6f, uid : exit status 1: 2019/01/27 04:57:29 socat[11124] E write(5, 0x14ed120, 186): Broken pipe
Error: transport is closing
Another one
$ ./helm install incubator/jaeger --tiller-namespace gitlab-managed-apps --name jaeger --set elasticsearch.rbac.create=true --set provisionDataStore.cassandra=false --set provisionDataStore.elasticsearch=true --set storage.type=elasticsearch
E0127 10:30:24.591751 429 portforward.go:331] an error occurred forwarding 45597 -> 44134: error forwarding port 44134 to pod 86b33bdc7bc30c08d98fe44c0772517c344dd1bdfefa290b46e82bf84959cb6f, uid : exit status 1: 2019/01/27 05:00:24 socat[13937] E write(5, 0x233d120, 8192): Connection reset by peer
Error: transport is closing
I tried forwarding the port myself, but it never returns to the prompt; it just hangs.
kubectl port-forward --namespace gitlab-managed-apps tiller-deploy 39113:44134
Apparently installing anything from GitLab's UI uses Helm, and those installs do not fail. Yet doing so from the shell fails. Please help me out.
Thanks in advance.
I know it's late, but I'll share this in case someone else struggles with this issue. I found an answer in the GitLab forums: HERE.
The trick is to export and decode the certificates from the Tiller service account and pass them as arguments to helm, like this:
helm list --tiller-connection-timeout 30 --tls --tls-ca-cert tiller-ca.crt --tls-cert tiller.crt --tls-key tiller.key --all --tiller-namespace gitlab-managed-apps
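For reference, one way to export those certificates is to decode them out of the Tiller TLS secret that GitLab creates (a sketch; the secret name tiller-secret and its key names are assumptions based on a typical GitLab-managed Tiller install, and on macOS you may need base64 -D instead of -d):
kubectl get secret tiller-secret -n gitlab-managed-apps -o jsonpath='{.data.ca\.crt}' | base64 -d > tiller-ca.crt
kubectl get secret tiller-secret -n gitlab-managed-apps -o jsonpath='{.data.tls\.crt}' | base64 -d > tiller.crt
kubectl get secret tiller-secret -n gitlab-managed-apps -o jsonpath='{.data.tls\.key}' | base64 -d > tiller.key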

New AKS cluster unreachable via network (including dashboard)

Yesterday I spun up an Azure Kubernetes Service cluster running a few simple apps. Three of them have exposed public IPs that were reachable yesterday.
As of this morning I can't get the dashboard tunnel to work, and the LoadBalancer IPs themselves are unreachable.
I was asked by the Azure twitter account to solicit help here.
I don't know how to troubleshoot this apparent network issue - only az seems to be able to touch my cluster.
dashboard error log
❯❯❯ make dashboard ~/c/azure-k8s (master)
az aks browse --resource-group=akc-rg-cf --name=akc-237
Merged "akc-237" as current context in /var/folders/9r/wx8xx8ls43l8w8b14f6fns8w0000gn/T/tmppst_atlw
Proxy running on http://127.0.0.1:8001/
Press CTRL+C to close the tunnel...
error: error upgrading connection: error dialing backend: dial tcp 10.240.0.4:10250: getsockopt: connection timed out
service+pod listing
❯❯❯ kubectl get services,pods ~/c/azure-k8s (master)
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
azure-vote-back ClusterIP 10.0.125.49 <none> 6379/TCP 16h
azure-vote-front LoadBalancer 10.0.185.4 40.71.248.106 80:31211/TCP 16h
hubot LoadBalancer 10.0.20.218 40.121.215.233 80:31445/TCP 26m
kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 19h
mti411-web LoadBalancer 10.0.162.209 52.168.123.30 80:30874/TCP 26m
NAME READY STATUS RESTARTS AGE
azure-vote-back-7556ff9578-sjjn5 1/1 Running 0 2h
azure-vote-front-5b8878fdcd-9lpzx 1/1 Running 0 16h
hubot-74f659b6b8-wctdz 1/1 Running 0 9s
mti411-web-6cc87d46c-g255d 1/1 Running 0 26m
mti411-web-6cc87d46c-lhjzp 1/1 Running 0 26m
http failures
❯❯❯ curl --connect-timeout 2 -I http://40.121.215.233 ~/c/azure-k8s (master)
curl: (28) Connection timed out after 2005 milliseconds
❯❯❯ curl --connect-timeout 2 -I http://52.168.123.30 ~/c/azure-k8s (master)
curl: (28) Connection timed out after 2001 milliseconds
If you are getting getsockopt: connection timed out while trying to access your AKS dashboard, deleting the tunnelfront pod should help: once you delete it, the master triggers the creation of a new tunnelfront pod. It's something I have tried, and it worked for me.
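A minimal way to do that (the exact pod name differs per cluster, so look it up first; <tunnelfront-pod-name> is a placeholder):
kubectl get pods -n kube-system | grep tunnelfront
kubectl delete pod <tunnelfront-pod-name> -n kube-system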
@daniel Did rebooting the agent VMs solve your issue, or are you still seeing issues?

Performance metrics in Kubernetes Dashboard missing in Azure Kubernetes deployment

Update 2: I was able to get the statistics by using Grafana and InfluxDB. However, I find this overkill; I want to see the current status of my cluster, not the historical trends per se. Based on the linked image, it should be possible using the pre-deployed Heapster and the Kubernetes Dashboard.
Update 1:
With the command below, I do see resource information. I guess the remaining part of the question is why it is not showing up (or how I should configure it to show up) in the Kubernetes dashboard, as shown in this image: https://docs.giantswarm.io/img/dashboard-ui.png
$ kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
k8s-agentpool0-41204139-0 36m 1% 682Mi 9%
k8s-agentpool0-41204139-1 33m 1% 732Mi 10%
k8s-agentpool0-41204139-10 36m 1% 690Mi 10%
[truncated]
I am trying to monitor performance in my Azure Kubernetes deployment. I noticed it has Heapster running by default. I did not launch it myself, but I do want to leverage it if it is there. My question is: how can I access it, or is there something wrong with it? Here are the details I can think of; let me know if you need more.
$ kubectl cluster-info
Kubernetes master is running at https://[hidden].uksouth.cloudapp.azure.com
Heapster is running at https://[hidden].uksouth.cloudapp.azure.com/api/v1/namespaces/kube-system/services/heapster/proxy
KubeDNS is running at https://[hidden].uksouth.cloudapp.azure.com/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
kubernetes-dashboard is running at https://[hidden].uksouth.cloudapp.azure.com/api/v1/namespaces/kube-system/services/kubernetes-dashboard/proxy
tiller-deploy is running at https://[hidden].uksouth.cloudapp.azure.com/api/v1/namespaces/kube-system/services/tiller-deploy:tiller/proxy
I set up a proxy:
$ kubectl proxy
Starting to serve on 127.0.0.1:8001
Point my browser to
localhost:8001/api/v1/namespaces/kube-system/services/kubernetes-dashboard/proxy/#!/workload?namespace=default
I see the Kubernetes dashboard, but I notice that I do not see the performance graphs that are displayed at https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/. I also do not see the admin section.
I then point my browser to localhost:8001/api/v1/namespaces/kube-system/services/heapster/proxy and get
404 page not found
Inspecting the pods:
kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
heapster-2205950583-43w4b 2/2 Running 0 1d
kube-addon-manager-k8s-master-41204139-0 1/1 Running 0 1d
kube-apiserver-k8s-master-41204139-0 1/1 Running 0 1d
kube-controller-manager-k8s-master-41204139-0 1/1 Running 0 1d
kube-dns-v20-2000462293-1j20h 3/3 Running 0 16h
kube-dns-v20-2000462293-hqwfn 3/3 Running 0 16h
kube-proxy-0kwkf 1/1 Running 0 1d
kube-proxy-13bh5 1/1 Running 0 1d
[truncated]
kube-proxy-zfbb1 1/1 Running 0 1d
kube-scheduler-k8s-master-41204139-0 1/1 Running 0 1d
kubernetes-dashboard-732940207-w7pt2 1/1 Running 0 1d
tiller-deploy-3007245560-4tk78 1/1 Running 0 1d
Checking the log:
$ kubectl logs heapster-2205950583-43w4b heapster --namespace=kube-system
I0309 06:11:21.241752 19 heapster.go:72] /heapster --source=kubernetes.summary_api:""
I0309 06:11:21.241813 19 heapster.go:73] Heapster version v1.4.2
I0309 06:11:21.242310 19 configs.go:61] Using Kubernetes client with master "https://10.0.0.1:443" and version v1
I0309 06:11:21.242331 19 configs.go:62] Using kubelet port 10255
I0309 06:11:21.243557 19 heapster.go:196] Starting with Metric Sink
I0309 06:11:21.344547 19 heapster.go:106] Starting heapster on port 8082
E0309 14:14:05.000293 19 summary.go:389] Node k8s-agentpool0-41204139-32 is not ready
E0309 14:14:05.000331 19 summary.go:389] Node k8s-agentpool0-41204139-56 is not ready
[truncated the other agent pool messages saying not ready]
E0309 14:24:05.000645 19 summary.go:389] Node k8s-master-41204139-0 is not ready
$ kubectl describe pod heapster-2205950583-43w4b --namespace=kube-system
Name: heapster-2205950583-43w4b
Namespace: kube-system
Node: k8s-agentpool0-41204139-54/10.240.0.11
Start Time: Fri, 09 Mar 2018 07:11:15 +0100
Labels: k8s-app=heapster
pod-template-hash=2205950583
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"kube-system","name":"heapster-2205950583","uid":"ac75e772-2360-11e8-9e1c-00224807...
scheduler.alpha.kubernetes.io/critical-pod=
Status: Running
IP: 10.244.58.2
Controlled By: ReplicaSet/heapster-2205950583
Containers:
heapster:
Container ID: docker://a9205e7ab9070a1d1bdee4a1b93eb47339972ad979c4d35e7d6b59ac15a91817
Image: k8s-gcrio.azureedge.net/heapster-amd64:v1.4.2
Image ID: docker-pullable://k8s-gcrio.azureedge.net/heapster-amd64#sha256:f58ded16b56884eeb73b1ba256bcc489714570bacdeca43d4ba3b91ef9897b20
Port: <none>
Command:
/heapster
--source=kubernetes.summary_api:""
State: Running
Started: Fri, 09 Mar 2018 07:11:20 +0100
Ready: True
Restart Count: 0
Limits:
cpu: 121m
memory: 464Mi
Requests:
cpu: 121m
memory: 464Mi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from heapster-token-txk8b (ro)
heapster-nanny:
Container ID: docker://68e021532a482f32abec844d6f9ea00a4a8232b8d1004b7df4199d2c7d3a3b4c
Image: k8s-gcrio.azureedge.net/addon-resizer:1.7
Image ID: docker-pullable://k8s-gcrio.azureedge.net/addon-resizer#sha256:dcec9a5c2e20b8df19f3e9eeb87d9054a9e94e71479b935d5cfdbede9ce15895
Port: <none>
Command:
/pod_nanny
--cpu=80m
--extra-cpu=0.5m
--memory=140Mi
--extra-memory=4Mi
--threshold=5
--deployment=heapster
--container=heapster
--poll-period=300000
--estimator=exponential
State: Running
Started: Fri, 09 Mar 2018 07:11:18 +0100
Ready: True
Restart Count: 0
Limits:
cpu: 50m
memory: 90Mi
Requests:
cpu: 50m
memory: 90Mi
Environment:
MY_POD_NAME: heapster-2205950583-43w4b (v1:metadata.name)
MY_POD_NAMESPACE: kube-system (v1:metadata.namespace)
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from heapster-token-txk8b (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
heapster-token-txk8b:
Type: Secret (a volume populated by a Secret)
SecretName: heapster-token-txk8b
Optional: false
QoS Class: Guaranteed
Node-Selectors: beta.kubernetes.io/os=linux
Tolerations: <none>
Events: <none>
I have seen in the past that if you restart the dashboard pod it starts working. Can you try that real fast and let me know?
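For example, deleting the dashboard pod by the name shown in the listing above should make the ReplicaSet behind it recreate a fresh one:
kubectl delete pod kubernetes-dashboard-732940207-w7pt2 --namespace=kube-system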

kube-dns stays in ContainerCreating status

I have 5 machines running Ubuntu 16.04.1 LTS. I want to set them up as a Kubernetes cluster. I'm trying to follow this getting started guide, where they're using kubeadm.
It all worked fine until step 3/4, installing a pod network. I looked at their add-on page for a pod network and chose the flannel overlay network. I copied the yaml file to the machine and executed:
root@up01:/home/up# kubectl apply -f flannel.yml
Which resulted in:
configmap "kube-flannel-cfg" created
daemonset "kube-flannel-ds" created
So I thought it went OK, but when I list all the pods:
root@up01:/etc/kubernetes/manifests# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system dummy-2088944543-d5f50 1/1 Running 0 50m
kube-system etcd-up01 1/1 Running 0 48m
kube-system kube-apiserver-up01 1/1 Running 0 50m
kube-system kube-controller-manager-up01 1/1 Running 0 49m
kube-system kube-discovery-1769846148-jvx53 1/1 Running 0 50m
kube-system kube-dns-2924299975-prlgf 0/4 ContainerCreating 0 49m
kube-system kube-flannel-ds-jb1df 2/2 Running 0 32m
kube-system kube-proxy-rtcht 1/1 Running 0 49m
kube-system kube-scheduler-up01 1/1 Running 0 49m
The problem is that kube-dns stays stuck in the ContainerCreating state. I don't know what to do.
It is very likely that you missed this critical piece of information from the guide:
If you want to use flannel as the pod network, specify
--pod-network-cidr 10.244.0.0/16 if you’re using the daemonset manifest below.
If you omit this, kube-dns will never leave the ContainerCreating STATUS.
Your kubeadm init command should be:
# kubeadm init --pod-network-cidr 10.244.0.0/16
and not
# kubeadm init
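If you already ran kubeadm init without that flag, the usual way to apply it is to reset the node and initialize again (a sketch; note that kubeadm reset wipes the cluster state on that node):
# kubeadm reset
# kubeadm init --pod-network-cidr 10.244.0.0/16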
Did you try restarting NetworkManager? It worked for me. Plus, it also worked when I disabled IPv6.
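For reference, on a systemd-based distribution those two steps look roughly like this (the sysctl is a temporary, non-persistent way to disable IPv6; your setup may differ):
sudo systemctl restart NetworkManager
sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1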
