Kubernetes networking: pod cannot reach nodes - Linux

I have a Kubernetes cluster with 3 masters and 7 workers, using Calico as the CNI. When I deploy Calico, the calico-kube-controllers-xxx pod fails because it cannot reach 10.96.0.1:443.
2020-06-23 13:05:28.737 [INFO][1] main.go 88: Loaded configuration from environment config=&config.Config{LogLevel:"info", WorkloadEndpointWorkers:1, ProfileWorkers:1, PolicyWorkers:1, NodeWorkers:1, Kubeconfig:"", DatastoreType:"kubernetes"}
W0623 13:05:28.740128 1 client_config.go:541] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
2020-06-23 13:05:28.742 [INFO][1] main.go 109: Ensuring Calico datastore is initialized
2020-06-23 13:05:38.742 [ERROR][1] client.go 261: Error getting cluster information config ClusterInformation="default" error=Get https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-06-23 13:05:38.742 [FATAL][1] main.go 114: Failed to initialize Calico datastore error=Get https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
This is the situation in the kube-system namespace:
kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-77d6cbc65f-6bmjg 0/1 CrashLoopBackOff 56 4h33m
calico-node-94pkr 1/1 Running 0 36m
calico-node-d8vc4 1/1 Running 0 36m
calico-node-fgpd4 1/1 Running 0 37m
calico-node-jqgkp 1/1 Running 0 37m
calico-node-m9lds 1/1 Running 0 37m
calico-node-n5qmb 1/1 Running 0 37m
calico-node-t46jb 1/1 Running 0 36m
calico-node-w6xch 1/1 Running 0 38m
calico-node-xpz8k 1/1 Running 0 37m
calico-node-zbw4x 1/1 Running 0 36m
coredns-5644d7b6d9-ms7gv 0/1 Running 0 4h33m
coredns-5644d7b6d9-thwlz 0/1 Running 0 4h33m
kube-apiserver-k8s01 1/1 Running 7 34d
kube-apiserver-k8s02 1/1 Running 9 34d
kube-apiserver-k8s03 1/1 Running 7 34d
kube-controller-manager-k8s01 1/1 Running 7 34d
kube-controller-manager-k8s02 1/1 Running 9 34d
kube-controller-manager-k8s03 1/1 Running 8 34d
kube-proxy-9dppr 1/1 Running 3 4d
kube-proxy-9hhm9 1/1 Running 3 4d
kube-proxy-9svfk 1/1 Running 1 4d
kube-proxy-jctxm 1/1 Running 3 4d
kube-proxy-lsg7m 1/1 Running 3 4d
kube-proxy-m257r 1/1 Running 1 4d
kube-proxy-qtbbz 1/1 Running 2 4d
kube-proxy-v958j 1/1 Running 2 4d
kube-proxy-x97qx 1/1 Running 2 4d
kube-proxy-xjkjl 1/1 Running 3 4d
kube-scheduler-k8s01 1/1 Running 7 34d
kube-scheduler-k8s02 1/1 Running 9 34d
kube-scheduler-k8s03 1/1 Running 8 34d
In addition, CoreDNS also cannot reach the internal Kubernetes service.
Within a node, if I run wget -S 10.96.0.1:443, I receive a response.
wget -S 10.96.0.1:443
--2020-06-23 13:12:12-- http://10.96.0.1:443/
Connecting to 10.96.0.1:443... connected.
HTTP request sent, awaiting response...
HTTP/1.0 400 Bad Request
2020-06-23 13:12:12 ERROR 400: Bad Request.
But if I run wget -S 10.96.0.1:443 inside a pod, I get a timeout error.
I also cannot ping the nodes from pods.
Cluster pod cidr is 192.168.0.0/16.

I resolved this by recreating the cluster with a different pod CIDR. The 192.168.0.0/16 pod network most likely overlapped with a network the nodes were already using.

Related

Rook and ceph on kubernetes

I am new to Kubernetes. I need to integrate Rook and Ceph, adding NFS as block storage. Does anyone have any working examples? I followed this document: https://www.digitalocean.com/community/tutorials/how-to-set-up-a-ceph-cluster-within-kubernetes-using-rook and I am getting errors (pods stuck at ContainerCreating, stuck at pod initializing) while creating the Ceph cluster with Rook on Kubernetes. Any help would be appreciated.
kubectl get pod -n rook-ceph
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-provisioner-5bcd46f965-42f9r 0/5 ContainerCreating 0 12m
csi-cephfsplugin-provisioner-5bcd46f965-zszwz 5/5 Running 0 12m
csi-cephfsplugin-xcswb 3/3 Running 0 12m
csi-cephfsplugin-zwl9x 3/3 Running 0 12m
csi-rbdplugin-4mh9x 3/3 Running 0 12m
csi-rbdplugin-nlcjr 3/3 Running 0 12m
csi-rbdplugin-provisioner-6658cf554c-4xx9f 6/6 Running 0 12m
csi-rbdplugin-provisioner-6658cf554c-62xc2 0/6 ContainerCreating 0 12m
rook-ceph-detect-version-bwcmp 0/1 Init:0/1 0 9m18s
rook-ceph-operator-5dc456cdb6-n4tgm 1/1 Running 0 13m
rook-discover-l2r27 1/1 Running 0 13m
rook-discover-rxkv4 0/1 ContainerCreating 0 13m
Use kubectl describe pod <name> -n rook-ceph to see the list of events; they are at the bottom of the output. This will show where the pods get stuck.
It may also be the case that one of your nodes is in a bad state, since some pod replicas are failing to start. You can confirm by running
kubectl get pod -o wide | grep -v Running
Possibly all the failing pods are running on the same node. If that is the case, you can inspect the problematic node with
kubectl describe node [node]
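The triage steps above can be bundled into one sketch (the function name is made up for illustration, and it assumes your kubectl context points at the affected cluster):

```shell
# Sketch of the triage flow described above; the describe calls are shown as
# comments because their arguments depend on what step 1 reveals.
triage_stuck_pods() {
  ns=${1:-rook-ceph}
  # 1. List pods that are not Running, with the node each one landed on.
  kubectl get pod -n "$ns" -o wide | grep -v Running
  # 2. For each stuck pod, the events at the bottom of describe show why:
  #      kubectl describe pod <name> -n "$ns"
  # 3. If the failures cluster on one node, inspect that node:
  #      kubectl describe node <node>
}
```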

Node.js app in Kubernetes cluster doesn't stay running - CrashLoopBackOff

I have a small Node.js application for testing with Kubernetes, but it seems that the application does not keep running.
I put all the code that I developed for this test on GitHub.
I run kubectl create -f deploy.yaml
It works, but...
[webapp@srvapih ex-node]$ kubectl get pods
NAME READY STATUS RESTARTS AGE
api-7b89bd4755-4lc6k 1/1 Running 0 5s
api-7b89bd4755-7x964 0/1 ContainerCreating 0 5s
api-7b89bd4755-dv299 1/1 Running 0 5s
api-7b89bd4755-w6tzj 0/1 ContainerCreating 0 5s
api-7b89bd4755-xnm8l 0/1 ContainerCreating 0 5s
[webapp@srvapih ex-node]$ kubectl get pods
NAME READY STATUS RESTARTS AGE
api-7b89bd4755-4lc6k 0/1 CrashLoopBackOff 1 11s
api-7b89bd4755-7x964 0/1 CrashLoopBackOff 1 11s
api-7b89bd4755-dv299 0/1 CrashLoopBackOff 1 11s
api-7b89bd4755-w6tzj 0/1 CrashLoopBackOff 1 11s
api-7b89bd4755-xnm8l 0/1 CrashLoopBackOff 1 11s
Events for describe pod
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Pulled 6m48s (x5 over 8m14s) kubelet, srvweb05.beirario.intranet Container image "node:8-alpine" already present on machine
Normal Created 6m48s (x5 over 8m14s) kubelet, srvweb05.beirario.intranet Created container
Normal Started 6m48s (x5 over 8m12s) kubelet, srvweb05.beirario.intranet Started container
Normal Scheduled 6m9s default-scheduler Successfully assigned default/api-7b89bd4755-4lc6k to srvweb05.beirario.intranet
Warning BackOff 3m2s (x28 over 8m8s) kubelet, srvweb05.beirario.intranet Back-off restarting failed container
All I can say here: you are providing a command that finishes immediately: command: ["/bin/sh","-c", "node", "servidor.js"]. With sh -c, only the first argument ("node") is executed as the script; "servidor.js" merely becomes its $0, so node starts without a script and exits right away.
Instead, provide the command in a way that never completes, e.g. command: ["node", "servidor.js"].
Describing your pod shows that the container in the pod completed "successfully" with exit code 0:
Containers:
ex-node:
Container ID: docker://836ffd771b3514fd13ae3e6b8818a7f35807db55cf8f756e962131823a476675
Image: node:8-alpine
Image ID: docker-pullable://node@sha256:8e9987a6d91d783c56980f1bd4b23b4c05f9f6076d513d6350fef8fe09ed01fd
Port: 3000/TCP
Host Port: 0/TCP
Command:
/bin/sh
-c
node
servidor.js
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Fri, 08 Mar 2019 14:29:54 +0000
Finished: Fri, 08 Mar 2019 14:29:54 +0000
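The sh -c behavior behind that exit code 0 can be reproduced locally: only the first argument after -c is treated as the script, and everything after it becomes positional parameters ($0, $1, ...), so servidor.js is never actually passed to node.

```shell
# With sh -c, "servidor.js" becomes $0 of the script "node", not an argument
# to node itself:
sh -c 'echo "the script only sees $0 as its name"' servidor.js
# prints: the script only sees servidor.js as its name

# The fix in the pod spec is to run node directly:
#   command: ["node", "servidor.js"]
```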
You may be using the process.stdout.write method in your code; this will cause the k8s session to be lost. Do not print anything to stdout!
Try using pm2: https://pm2.io/docs/runtime/integration/docker/. It starts your Node.js app as a background process.

Can't do 'helm install' on cluster. Tiller was installed by GitLab

I created a cluster in GKE using GitLab, and installed Helm & Tiller plus some other things like ingress and GitLab Runner through GitLab's interface. But when I try to install something using helm from gcloud, it gives "Error: transport is closing".
I did gcloud container clusters get-credentials ....
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default jaeger-deployment-59ffb979c8-lmjk5 1/1 Running 0 17h
gitlab-managed-apps certmanager-cert-manager-6c8cd9f9bf-67wnh 1/1 Running 0 17h
gitlab-managed-apps ingress-nginx-ingress-controller-75c4d99549-x66n4 1/1 Running 0 21h
gitlab-managed-apps ingress-nginx-ingress-default-backend-6f58fb5f56-pvv2f 1/1 Running 0 21h
gitlab-managed-apps prometheus-kube-state-metrics-6584885ccf-hr8fw 1/1 Running 0 22h
gitlab-managed-apps prometheus-prometheus-server-69b9f444df-htxsq 2/2 Running 0 22h
gitlab-managed-apps runner-gitlab-runner-56798d9d9d-nljqn 1/1 Running 0 22h
gitlab-managed-apps tiller-deploy-74f5d65d77-xk6cc 1/1 Running 0 22h
kube-system heapster-v1.6.0-beta.1-7bdb4fd8f9-t8bq9 2/2 Running 0 22h
kube-system kube-dns-7549f99fcc-bhg9t 4/4 Running 0 22h
kube-system kube-dns-autoscaler-67c97c87fb-4vz9t 1/1 Running 0 22h
kube-system kube-proxy-gke-cluster2-pool-1-05abcbc6-0s6j 1/1 Running 0 20h
kube-system kube-proxy-gke-cluster2-pool-2-67e57524-ht5p 1/1 Running 0 22h
kube-system metrics-server-v0.2.1-fd596d746-289nd 2/2 Running 0 22h
visual-react-10450736 production-847c7d879c-z4h5t 1/1 Running 0 22h
visual-react-10450736 production-postgres-64cfcf9464-jr74c 1/1 Running 0 22h
$ ./helm install stable/wordpress --tiller-namespace gitlab-managed-apps --name wordpress
E0127 10:27:29.790366 418 portforward.go:331] an error occurred forwarding 39113 -> 44134: error forwarding port 44134 to pod 86b33bdc7bc30c08d98fe44c0772517c344dd1bdfefa290b46e82bf84959cb6f, uid : exit status 1: 2019/01/27 04:57:29 socat[11124] E write(5, 0x14ed120, 186): Broken pipe
Error: transport is closing
Another one
$ ./helm install incubator/jaeger --tiller-namespace gitlab-managed-apps --name jaeger --set elasticsearch.rbac.create=true --set provisionDataStore.cassandra=false --set provisionDataStore.elasticsearch=true --set storage.type=elasticsearch
E0127 10:30:24.591751 429 portforward.go:331] an error occurred forwarding 45597 -> 44134: error forwarding port 44134 to pod 86b33bdc7bc30c08d98fe44c0772517c344dd1bdfefa290b46e82bf84959cb6f, uid : exit status 1: 2019/01/27 05:00:24 socat[13937] E write(5, 0x233d120, 8192): Connection reset by peer
Error: transport is closing
I tried forwarding the port myself, but it never returns to the prompt; it hangs forever.
kubectl port-forward --namespace gitlab-managed-apps tiller-deploy 39113:44134
Apparently installing anything from GitLab's UI uses Helm, and those installs do not fail; yet doing the same from a shell fails. Please help me out.
Thanks in advance.
I know it's late, but I'll share this just in case someone else struggles with this issue. I found an answer in the GitLab forums: HERE.
The trick is to export and decode the certificates from the Tiller service account and pass them as arguments to helm like this:
helm list --tiller-connection-timeout 30 --tls --tls-ca-cert tiller-ca.crt --tls-cert tiller.crt --tls-key tiller.key --all --tiller-namespace gitlab-managed-apps
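How to obtain those files depends on your setup. As a sketch: GitLab-managed Tiller typically keeps its TLS material in a Kubernetes secret, and each key can be decoded out to a file. The secret name (tiller-secret) and key names here are assumptions; check kubectl get secrets -n gitlab-managed-apps on your cluster.

```shell
# Hypothetical helper (names are assumptions, not confirmed by the answer):
# decode one key of the Tiller TLS secret to the file helm expects.
fetch_tiller_file() {             # usage: fetch_tiller_file <secret-key> <out-file>
  kubectl get secret tiller-secret -n gitlab-managed-apps \
    -o "jsonpath={.data.$1}" | base64 -d > "$2"
}
# fetch_tiller_file 'ca\.crt'  tiller-ca.crt
# fetch_tiller_file 'tls\.crt' tiller.crt
# fetch_tiller_file 'tls\.key' tiller.key
```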

Performance metrics in Kubernetes Dashboard missing in Azure Kubernetes deployment

Update 2: I was able to get the statistics by using Grafana and InfluxDB. However, I find this overkill: I want to see the current status of my cluster, not so much the historical trends. Based on the linked image, it should be possible using the pre-deployed Heapster and the Kubernetes Dashboard.
Update 1:
With the command below, I do see resource information. I guess the remaining part of the question is why it is not showing up (or how I should configure it to show up) in the Kubernetes Dashboard, as shown in this image: https://docs.giantswarm.io/img/dashboard-ui.png
$ kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
k8s-agentpool0-41204139-0 36m 1% 682Mi 9%
k8s-agentpool0-41204139-1 33m 1% 732Mi 10%
k8s-agentpool0-41204139-10 36m 1% 690Mi 10%
[truncated]
I am trying to monitor performance in my Azure Kubernetes deployment. I noticed it has Heapster running by default. I did not launch it myself, but I do want to leverage it since it is there. My question is: how can I access it, or is there something wrong with it? Here are the details I can think of; let me know if you need more.
$ kubectl cluster-info
Kubernetes master is running at https://[hidden].uksouth.cloudapp.azure.com
Heapster is running at https://[hidden].uksouth.cloudapp.azure.com/api/v1/namespaces/kube-system/services/heapster/proxy
KubeDNS is running at https://[hidden].uksouth.cloudapp.azure.com/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
kubernetes-dashboard is running at https://[hidden].uksouth.cloudapp.azure.com/api/v1/namespaces/kube-system/services/kubernetes-dashboard/proxy
tiller-deploy is running at https://[hidden].uksouth.cloudapp.azure.com/api/v1/namespaces/kube-system/services/tiller-deploy:tiller/proxy
I set up a proxy:
$ kubectl proxy
Starting to serve on 127.0.0.1:8001
Point my browser to
localhost:8001/api/v1/namespaces/kube-system/services/kubernetes-dashboard/proxy/#!/workload?namespace=default
I see the Kubernetes Dashboard, but notice that I do not see the performance graphs that are displayed at https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/. I also do not see the admin section.
I then point my browser to localhost:8001/api/v1/namespaces/kube-system/services/heapster/proxy and get
404 page not found
Inspecting the pods:
kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
heapster-2205950583-43w4b 2/2 Running 0 1d
kube-addon-manager-k8s-master-41204139-0 1/1 Running 0 1d
kube-apiserver-k8s-master-41204139-0 1/1 Running 0 1d
kube-controller-manager-k8s-master-41204139-0 1/1 Running 0 1d
kube-dns-v20-2000462293-1j20h 3/3 Running 0 16h
kube-dns-v20-2000462293-hqwfn 3/3 Running 0 16h
kube-proxy-0kwkf 1/1 Running 0 1d
kube-proxy-13bh5 1/1 Running 0 1d
[truncated]
kube-proxy-zfbb1 1/1 Running 0 1d
kube-scheduler-k8s-master-41204139-0 1/1 Running 0 1d
kubernetes-dashboard-732940207-w7pt2 1/1 Running 0 1d
tiller-deploy-3007245560-4tk78 1/1 Running 0 1d
Checking the log:
$kubectl logs heapster-2205950583-43w4b heapster --namespace=kube-system
I0309 06:11:21.241752 19 heapster.go:72] /heapster --source=kubernetes.summary_api:""
I0309 06:11:21.241813 19 heapster.go:73] Heapster version v1.4.2
I0309 06:11:21.242310 19 configs.go:61] Using Kubernetes client with master "https://10.0.0.1:443" and version v1
I0309 06:11:21.242331 19 configs.go:62] Using kubelet port 10255
I0309 06:11:21.243557 19 heapster.go:196] Starting with Metric Sink
I0309 06:11:21.344547 19 heapster.go:106] Starting heapster on port 8082
E0309 14:14:05.000293 19 summary.go:389] Node k8s-agentpool0-41204139-32 is not ready
E0309 14:14:05.000331 19 summary.go:389] Node k8s-agentpool0-41204139-56 is not ready
[truncated the other agent pool messages saying not ready]
E0309 14:24:05.000645 19 summary.go:389] Node k8s-master-41204139-0 is not ready
$kubectl describe pod heapster-2205950583-43w4b --namespace=kube-system
Name: heapster-2205950583-43w4b
Namespace: kube-system
Node: k8s-agentpool0-41204139-54/10.240.0.11
Start Time: Fri, 09 Mar 2018 07:11:15 +0100
Labels: k8s-app=heapster
pod-template-hash=2205950583
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"kube-system","name":"heapster-2205950583","uid":"ac75e772-2360-11e8-9e1c-00224807...
scheduler.alpha.kubernetes.io/critical-pod=
Status: Running
IP: 10.244.58.2
Controlled By: ReplicaSet/heapster-2205950583
Containers:
heapster:
Container ID: docker://a9205e7ab9070a1d1bdee4a1b93eb47339972ad979c4d35e7d6b59ac15a91817
Image: k8s-gcrio.azureedge.net/heapster-amd64:v1.4.2
Image ID: docker-pullable://k8s-gcrio.azureedge.net/heapster-amd64@sha256:f58ded16b56884eeb73b1ba256bcc489714570bacdeca43d4ba3b91ef9897b20
Port: <none>
Command:
/heapster
--source=kubernetes.summary_api:""
State: Running
Started: Fri, 09 Mar 2018 07:11:20 +0100
Ready: True
Restart Count: 0
Limits:
cpu: 121m
memory: 464Mi
Requests:
cpu: 121m
memory: 464Mi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from heapster-token-txk8b (ro)
heapster-nanny:
Container ID: docker://68e021532a482f32abec844d6f9ea00a4a8232b8d1004b7df4199d2c7d3a3b4c
Image: k8s-gcrio.azureedge.net/addon-resizer:1.7
Image ID: docker-pullable://k8s-gcrio.azureedge.net/addon-resizer@sha256:dcec9a5c2e20b8df19f3e9eeb87d9054a9e94e71479b935d5cfdbede9ce15895
Port: <none>
Command:
/pod_nanny
--cpu=80m
--extra-cpu=0.5m
--memory=140Mi
--extra-memory=4Mi
--threshold=5
--deployment=heapster
--container=heapster
--poll-period=300000
--estimator=exponential
State: Running
Started: Fri, 09 Mar 2018 07:11:18 +0100
Ready: True
Restart Count: 0
Limits:
cpu: 50m
memory: 90Mi
Requests:
cpu: 50m
memory: 90Mi
Environment:
MY_POD_NAME: heapster-2205950583-43w4b (v1:metadata.name)
MY_POD_NAMESPACE: kube-system (v1:metadata.namespace)
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from heapster-token-txk8b (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
heapster-token-txk8b:
Type: Secret (a volume populated by a Secret)
SecretName: heapster-token-txk8b
Optional: false
QoS Class: Guaranteed
Node-Selectors: beta.kubernetes.io/os=linux
Tolerations: <none>
Events: <none>
I have seen in the past that restarting the dashboard pod makes it start working. Can you try that real quick and let me know?
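A sketch of that restart (the label selector is the one the dashboard add-on normally carries, so treat it as an assumption; deleting the pod lets its controller recreate it):

```shell
# Sketch only: delete the dashboard pod by label and let its Deployment
# recreate it. Requires kubectl access to the cluster, so it is wrapped
# rather than run directly.
restart_dashboard() {
  kubectl delete pod -n kube-system -l k8s-app=kubernetes-dashboard
}
```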

kube-dns stays in ContainerCreating status

I have 5 machines running Ubuntu 16.04.1 LTS that I want to set up as a Kubernetes cluster. I'm trying to follow this getting-started guide, which uses kubeadm.
It all worked fine until step 3/4, installing a pod network. I looked at their addons page for a pod network and chose the flannel overlay network. I copied the yaml file to the machine and executed:
root@up01:/home/up# kubectl apply -f flannel.yml
Which resulted in:
configmap "kube-flannel-cfg" created
daemonset "kube-flannel-ds" created
So I thought that went OK, but when I display all the pods:
root@up01:/etc/kubernetes/manifests# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system dummy-2088944543-d5f50 1/1 Running 0 50m
kube-system etcd-up01 1/1 Running 0 48m
kube-system kube-apiserver-up01 1/1 Running 0 50m
kube-system kube-controller-manager-up01 1/1 Running 0 49m
kube-system kube-discovery-1769846148-jvx53 1/1 Running 0 50m
kube-system kube-dns-2924299975-prlgf 0/4 ContainerCreating 0 49m
kube-system kube-flannel-ds-jb1df 2/2 Running 0 32m
kube-system kube-proxy-rtcht 1/1 Running 0 49m
kube-system kube-scheduler-up01 1/1 Running 0 49m
The problem is that kube-dns stays in the ContainerCreating state. I don't know what to do.
It is very likely that you missed this critical piece of information from the guide:
If you want to use flannel as the pod network, specify
--pod-network-cidr 10.244.0.0/16 if you’re using the daemonset manifest below.
If you omit this, kube-dns will never leave the ContainerCreating status.
Your kubeadm init command should be:
# kubeadm init --pod-network-cidr 10.244.0.0/16
and not
# kubeadm init
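If you are not sure how the cluster was initialized, you can check the configured cluster CIDR after the fact. This is a sketch (it requires kubectl access, and the exact place the flag shows up can vary by Kubernetes version), so the call is wrapped rather than run here.

```shell
# Sketch: the controller-manager carries the --cluster-cidr flag; for the
# flannel daemonset manifest it must read --cluster-cidr=10.244.0.0/16.
show_cluster_cidr() {
  kubectl cluster-info dump | grep -m 1 -- "--cluster-cidr"
}
```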
Did you try restarting NetworkManager? It worked for me. It also worked when I disabled IPv6.