Error: container "dnsmasq" is unhealthy, it will be killed and re-created while running a local cluster in Kubernetes - dns

I am running a local Kubernetes cluster using the ./hack/local-up-cluster.sh script. When my firewall is off, all the containers in kube-dns are running:
```
# cluster/kubectl.sh get pods --all-namespaces
NAMESPACE     NAME                      READY     STATUS    RESTARTS   AGE
kube-system   kube-dns-73328275-87g4d   3/3       Running   0          45s
```
But when the firewall is on, I can see only 2 of the 3 containers running:
```
# cluster/kubectl.sh get pods --all-namespaces
NAMESPACE     NAME                      READY     STATUS    RESTARTS   AGE
kube-system   kube-dns-806549836-49v7d   2/3      Running   0          45s
```
After investigating in detail, it turns out the pod is failing because the dnsmasq container is not running:
```
7m 7m 1 kubelet, 127.0.0.1 spec.containers{dnsmasq} Normal Killing Killing container with id docker://41ef024a0610463e04607665276bb64e07f589e79924e3521708ca73de33142c:pod "kube-dns-806549836-49v7d_kube-system(d5729c5c-24da-11e7-b166-52540083b23a)" container "dnsmasq" is unhealthy, it will be killed and re-created.
```
Can you help me figure out how to run the dnsmasq container with the firewall on, and what exactly I would need to change? TIA.
Update: it turns out my kube-dns service has no endpoints. Any idea why that is?

You can flush the iptables rules (iptables -F) before starting your cluster; that can solve the problem.
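A minimal sketch of that workaround (note that iptables -F flushes all firewall rules on the host, so only do this on a development machine):
```
# Flush all iptables rules, then start the local cluster
sudo iptables -F
./hack/local-up-cluster.sh

# In another shell, confirm kube-dns is 3/3 and has endpoints again
cluster/kubectl.sh get pods --all-namespaces
cluster/kubectl.sh get endpoints kube-dns --namespace=kube-system
```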

Related

Could not get apiVersions from Kubernetes: Unable to retrieve the complete list of server APIs

While trying to deploy an application, I got the error below:
Error: UPGRADE FAILED: could not get apiVersions from Kubernetes: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
The output of kubectl api-resources lists some resources, along with the same error at the end.
Environment: Azure Cloud, AKS Service
Solution:
The steps I followed are:
1. kubectl get apiservices: if the metrics-server service is down with the error CrashLoopBackOff, go to step 2; otherwise just try to restart the metrics-server service using kubectl delete apiservice/"service_name". For me it was v1beta1.metrics.k8s.io.
2. kubectl get pods -n kube-system: I found that pods like metrics-server and kubernetes-dashboard were down because the main coreDNS pod was down.
For me it was:
```
NAME                          READY   STATUS             RESTARTS   AGE
pod/coredns-85577b65b-zj2x2   0/1     CrashLoopBackOff   7          13m
```
3. Use kubectl describe pod/"pod_name" to check the error in the coreDNS pod. If it is down because of /etc/coredns/Corefile:10 - Error during parsing: Unknown directive proxy, then use forward instead of proxy in the YAML file that holds the coreDNS config, because CoreDNS 1.5.x, used by the image, no longer supports the proxy keyword. A sketch of both fixes follows these steps.
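A hedged sketch of those two fixes, using the APIService name from step 1 (your resource names may differ):
```
# Restart the failing APIService by deleting it, as in step 1
kubectl delete apiservice v1beta1.metrics.k8s.io

# Edit the CoreDNS config: replace the `proxy . /etc/resolv.conf` line
# with `forward . /etc/resolv.conf` (CoreDNS 1.5+ dropped `proxy`)
kubectl -n kube-system edit configmap coredns

# Restart the CoreDNS pods so they pick up the new Corefile
kubectl -n kube-system delete pod -l k8s-app=kube-dns
```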
This error commonly happens when your metrics-server pod is not reachable by the master node. Possible reasons are:
- The metrics-server pod is not running. This is the first thing you should check. Then look at the logs of the metrics-server pod to check whether it has permission issues when trying to get metrics.
- Communication between the master and worker nodes is broken; try to confirm it works.
- metrics-server itself is misbehaving; try running kubectl top nodes and kubectl top pods -A to see if it responds OK.
From these points you can proceed further.

Not able to see Kubernetes UI Dashboard

I have set up a cluster with 2 nodes: one is the master and the other is a worker node, each on a different Azure Ubuntu VM. For networking, I used Canal.
```
$ kubectl get nodes
NAME             STATUS    ROLES     AGE       VERSION
ubuntu-aniket1   Ready     master    57m       v1.10.0
ubutu-aniket     Ready     <none>    56m       v1.10.0
```
```
$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                      READY     STATUS    RESTARTS   AGE
kube-system   canal-jztfd                               3/3       Running   0          57m
kube-system   canal-mdbbp                               3/3       Running   0          57m
kube-system   etcd-ubuntu-aniket1                       1/1       Running   0          58m
kube-system   kube-apiserver-ubuntu-aniket1             1/1       Running   0          58m
kube-system   kube-controller-manager-ubuntu-aniket1    1/1       Running   0          58m
kube-system   kube-dns-86f4d74b45-8zqqr                 3/3       Running   0          58m
kube-system   kube-proxy-k5ggz                          1/1       Running   0          58m
kube-system   kube-proxy-vx9sq                          1/1       Running   0          57m
kube-system   kube-scheduler-ubuntu-aniket1             1/1       Running   0          58m
kube-system   kubernetes-dashboard-54865c6fb9-kg5zt     1/1       Running   0          26m
```
When I tried to create the Kubernetes dashboard with
```
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
```
and set up the proxy with
```
$ kubectl proxy --address 0.0.0.0 --accept-hosts '.*'
Starting to serve on [::]:8001
```
When I hit the URL http://<master IP>:8001 in a browser, it shows the following output:
```
{
"paths": [
"/api",
"/api/v1",
"/apis",
"/apis/",
"/apis/admissionregistration.k8s.io",
"/apis/admissionregistration.k8s.io/v1beta1",
"/apis/apiextensions.k8s.io",
"/apis/apiextensions.k8s.io/v1beta1",
"/apis/apiregistration.k8s.io",
"/apis/apiregistration.k8s.io/v1",
"/apis/apiregistration.k8s.io/v1beta1",
"/apis/apps",
"/apis/apps/v1",
"/apis/apps/v1beta1",
"/apis/apps/v1beta2",
"/apis/authentication.k8s.io",
"/apis/authentication.k8s.io/v1",
"/apis/authentication.k8s.io/v1beta1",
"/apis/authorization.k8s.io",
"/apis/authorization.k8s.io/v1",
"/apis/authorization.k8s.io/v1beta1",
"/apis/autoscaling",
"/apis/autoscaling/v1",
"/apis/autoscaling/v2beta1",
"/apis/batch",
"/apis/batch/v1",
"/apis/batch/v1beta1",
"/apis/certificates.k8s.io",
"/apis/certificates.k8s.io/v1beta1",
"/apis/crd.projectcalico.org",
"/apis/crd.projectcalico.org/v1",
"/apis/events.k8s.io",
"/apis/events.k8s.io/v1beta1",
"/apis/extensions",
"/apis/extensions/v1beta1",
"/apis/networking.k8s.io",
"/apis/networking.k8s.io/v1",
"/apis/policy",
"/apis/policy/v1beta1",
"/apis/rbac.authorization.k8s.io",
"/apis/rbac.authorization.k8s.io/v1",
"/apis/rbac.authorization.k8s.io/v1beta1",
"/apis/storage.k8s.io",
"/apis/storage.k8s.io/v1",
"/apis/storage.k8s.io/v1beta1",
"/healthz",
"/healthz/autoregister-completion",
"/healthz/etcd",
"/healthz/ping",
"/healthz/poststarthook/apiservice-openapi-controller",
"/healthz/poststarthook/apiservice-registration-controller",
"/healthz/poststarthook/apiservice-status-available-controller",
"/healthz/poststarthook/bootstrap-controller",
"/healthz/poststarthook/ca-registration",
"/healthz/poststarthook/generic-apiserver-start-informers",
"/healthz/poststarthook/kube-apiserver-autoregistration",
"/healthz/poststarthook/rbac/bootstrap-roles",
"/healthz/poststarthook/start-apiextensions-controllers",
"/healthz/poststarthook/start-apiextensions-informers",
"/healthz/poststarthook/start-kube-aggregator-informers",
"/healthz/poststarthook/start-kube-apiserver-informers",
"/logs",
"/metrics",
"/openapi/v2",
"/swagger-2.0.0.json",
"/swagger-2.0.0.pb-v1",
"/swagger-2.0.0.pb-v1.gz",
"/swagger.json",
"/swaggerapi",
"/version"
]
}
```
But when I try to hit http://<master IP>:8001/ui, I am not able to see the Kubernetes dashboard. Instead I see the following output:
```
{
"paths": [
"/apis",
"/apis/",
"/apis/apiextensions.k8s.io",
"/apis/apiextensions.k8s.io/v1beta1",
"/healthz",
"/healthz/etcd",
"/healthz/ping",
"/healthz/poststarthook/generic-apiserver-start-informers",
"/healthz/poststarthook/start-apiextensions-controllers",
"/healthz/poststarthook/start-apiextensions-informers",
"/metrics",
"/openapi/v2",
"/swagger-2.0.0.json",
"/swagger-2.0.0.pb-v1",
"/swagger-2.0.0.pb-v1.gz",
"/swagger.json",
"/swaggerapi",
"/version"
]
}
```
Could you please help me resolve the dashboard issue?
Thanks in advance
Try going to:
http://<master IP>:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/
As mentioned here: https://github.com/kubernetes/dashboard
As mentioned in kubernetes/dashboard issue 1803:
since the changes in Kubernetes 1.6, users that want to enable RBAC should configure it first to allow the dashboard access to the API server.
Make sure you have defined a service account, as described there, to be able to access the dashboard.
See "Service Account Permissions":
Default RBAC policies grant scoped permissions to control-plane components, nodes, and controllers, but grant no permissions to service accounts outside the “kube-system” namespace (beyond discovery permissions given to all authenticated users).
This allows you to grant particular roles to particular service accounts as needed.
Fine-grained role bindings provide greater security, but require more effort to administrate.
Broader grants can give unnecessary (and potentially escalating) API access to service accounts, but are easier to administrate.
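A minimal sketch of the broader-grant approach from the quote above, with a hypothetical dashboard-admin account name; binding to cluster-admin is convenient on a test cluster but too permissive for production:
```
# Create a service account for the dashboard (the name is illustrative)
kubectl -n kube-system create serviceaccount dashboard-admin

# Bind it to the built-in cluster-admin role (a broad, potentially escalating grant)
kubectl create clusterrolebinding dashboard-admin \
  --clusterrole=cluster-admin \
  --serviceaccount=kube-system:dashboard-admin

# The token secret gets a random suffix; list it, then read the token
kubectl -n kube-system get secret | grep dashboard-admin-token
kubectl -n kube-system describe secret <dashboard-admin-token-xxxxx>
```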
I faced the same issue when I was creating my self-hosted Kubernetes cluster on AWS EC2 machines. I troubleshooted it in the following way and fixed it:
```
$ ssh -i ~/.ssh/id_rsa admin@api.example.com   # enter the master machine from the machine where kops is installed
$ kubectl proxy --address=0.0.0.0 --port=8001 &
$ ssh -i pemfile username@IP-address           # on the machine where you installed kops
$ cat ~/.kube/config                           # to get the user name and password
$ kubectl -n kube-system describe secret admin-user-token-id
```
To get the dashboard:
http://MasterIP_address:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/

Cannot get kube-dns to start on Kubernetes

Hoping someone can help.
I have a 3-node CoreOS cluster running Kubernetes. The nodes are as follows:
192.168.1.201 - Controller
192.168.1.202 - Worker Node
192.168.1.203 - Worker Node
The cluster is up and running, and I can run the following commands:
```
> kubectl get nodes
NAME            STATUS                     AGE
192.168.1.201   Ready,SchedulingDisabled   1d
192.168.1.202   Ready                      21h
192.168.1.203   Ready                      21h

> kubectl get pods --namespace=kube-system
NAME                                    READY     STATUS             RESTARTS   AGE
kube-apiserver-192.168.1.201            1/1       Running            2          1d
kube-controller-manager-192.168.1.201   1/1       Running            4          1d
kube-dns-v20-h4w7m                      2/3       CrashLoopBackOff   15         23m
kube-proxy-192.168.1.201                1/1       Running            2          1d
kube-proxy-192.168.1.202                1/1       Running            1          21h
kube-proxy-192.168.1.203                1/1       Running            1          21h
kube-scheduler-192.168.1.201            1/1       Running            4          1d
```
As you can see, the kube-dns pod is not running correctly. It keeps restarting and I am struggling to understand why. Any help in debugging this would be greatly appreciated (or pointers to where to read about debugging this). Running kubectl logs does not bring anything back; I'm not sure if the add-ons function differently to standard pods.
Running kubectl describe pods, I can see the containers are killed due to being unhealthy:
```
16m  16m  1  {kubelet 192.168.1.203}  spec.containers{kubedns}  Normal  Created  Created container with docker id 189afaa1eb0d; Security:[seccomp=unconfined]
16m  16m  1  {kubelet 192.168.1.203}  spec.containers{kubedns}  Normal  Started  Started container with docker id 189afaa1eb0d
14m  14m  1  {kubelet 192.168.1.203}  spec.containers{kubedns}  Normal  Killing  Killing container with docker id 189afaa1eb0d: pod "kube-dns-v20-h4w7m_kube-system(3a545c95-ea19-11e6-aa7c-52540021bfab)" container "kubedns" is unhealthy, it will be killed and re-created
```
Please find the full output of this command in a GitHub gist here: https://gist.github.com/mehstg/0b8016f5398a8781c3ade8cf49c02680
Thanks in advance!
If you installed your cluster with kubeadm, you should add a pod network after installing.
If you choose flannel as your pod network, you need this argument in your init command: kubeadm init --pod-network-cidr 10.244.0.0/16.
The flannel YAML file can be found in the CoreOS flannel repo.
All you need to do, if your cluster was initialized properly (read above), is to run:
```
kubectl create -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
```
Once this is up and running (it will create pods on every node), your kube-dns pod should come up.
If you need to reset your installation (for example to add the argument to kubeadm init), you can use kubeadm reset on all nodes.
Normally, you would run the init command on the master, then add a pod network, and then add your other nodes.
This is all described in more detail in the Getting Started guide, step 3/4 regarding the pod network. The full sequence is sketched below.
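A minimal sketch of that sequence, assuming flannel and the commands already quoted above (run the init on the master, install the network, then join the workers):
```
# On every node: wipe any previous kubeadm state
kubeadm reset

# On the master: initialize with the CIDR flannel expects
kubeadm init --pod-network-cidr 10.244.0.0/16

# Still on the master: install the flannel pod network
kubectl create -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

# Watch kube-dns come up once flannel pods are running on every node
kubectl get pods --namespace=kube-system -w
```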
As your gist shows, your pod network seems to be broken. You are using a custom pod network with 10.10.10.X. You should communicate these IPs to all components.
Please check that there is no collision with other existing networks.
I recommend setting it up with Calico, as this was the solution that got CoreOS Kubernetes working for me.
After following the steps in the official kubeadm doc with flannel networking (http://janetkuo.github.io/docs/getting-started-guides/kubeadm/), I ran into a similar issue.
It appears the networking pods get stuck in error states:
```
kube-dns-xxxxxxxx-xxxvn   (rpc error)
kube-flannel-ds-xxxxx     (CrashLoopBackOff)
kube-flannel-ds-xxxxx     (CrashLoopBackOff)
kube-flannel-ds-xxxxx     (CrashLoopBackOff)
```
In my case it was related to RBAC permission errors and was resolved by running:
```
kubectl create -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel-rbac.yml
```
Afterwards, all kube-system pods went into the Running state. The upstream issue is discussed on GitHub: https://github.com/kubernetes/kubernetes/issues/44029

External DNS resolution stopped working in Container Engine

I have a simple container on Google Container Engine that has been running for months with no issues. Suddenly, I cannot resolve ANY external domain. While troubleshooting, I have re-created the container many times and upgraded the cluster version to 1.4.7 in an attempt to resolve the issue, with no change.
To rule the app code out as much as possible, even basic Node.js code cannot resolve an external domain:
```
const dns = require('dns');
dns.lookup('nodejs.org', function (err, addresses, family) {
  console.log('addresses:', addresses); // logs 'undefined'
});
```
The same code run on a local machine or in a local Docker container works as expected.
This kubectl call fails as well:
```
# kubectl exec -ti busybox -- nslookup kubernetes.default
nslookup: can't resolve 'kubernetes.default'
```
Two pods show up when getting the kube-dns pods (admittedly I'm not sure if that is expected):
```
# kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
NAME                 READY     STATUS    RESTARTS   AGE
kube-dns-v20-v8pd6   3/3       Running   0          1h
kube-dns-v20-vtz4o   3/3       Running   0          1h
```
Both say this when trying to check for errors in the DNS pod:
```
# kubectl logs --namespace=kube-system pod/kube-dns-v20-v8pd6 -c kube-dns
Error from server: container kube-dns is not valid for pod kube-dns-v20-v8pd6
```
I suspect the internally created kube-dns is not properly pulling external DNS results, or some other linkage disappeared.
I'll accept almost any workaround if one exists, as this is a production app - perhaps it is possible to manually set nameservers in the Kubernetes controller YAML file or elsewhere. Setting the contents of /etc/resolv.conf in the Dockerfile does not seem to work.
Just checked, and in our own clusters we usually have 3 kube-dns pods, so something seems off there.
- What does this say: kubectl describe rc kube-dns-v20 --namespace=kube-system
- What happens when you kill the kube-dns pods? (The rc should automatically restart them.)
- What happens when you do an nslookup with a specific nameserver: nslookup nodejs.org 8.8.8.8
A sketch of these checks is below.
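A minimal sketch of those checks, assuming the busybox pod from the question is still running:
```
# Inspect the replication controller behind the kube-dns pods
kubectl describe rc kube-dns-v20 --namespace=kube-system

# Kill the pods; the rc should recreate them
kubectl delete pods --namespace=kube-system -l k8s-app=kube-dns

# Bypass cluster DNS: resolve against a public nameserver from inside the pod
kubectl exec -ti busybox -- nslookup nodejs.org 8.8.8.8

# Compare with the cluster DNS path
kubectl exec -ti busybox -- nslookup nodejs.org
```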

Kubernetes Kubelet says that DNS is not set with MissingClusterDNS (SkyDNS)

I've installed Kubernetes 1.2.4 on 3 minions/master (1 master/minion, 2 minions) and installed the SkyDNS add-on. After fixing SSL cert problems, I now have SkyDNS working. But kubelet still says that I didn't set cluster-dns and cluster-domain.
(see edits at the bottom)
But you can see --cluster-dns=192.168.0.10 --cluster-domain=cluster.local:
```
ps ax | grep kubelet
18717 ?  Ssl  0:04  /opt/kubernetes/bin/kubelet --logtostderr=true --v=0 --address=0.0.0.0 --port=10250 --hostname-override=k8s-minion-1 --api-servers=http://k8s-master:8080 --allow-privileged=false --cluster-dns=192.168.0.10 --cluster-domain=cluster.local
```
Launching this pod:
```
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - image: busybox
    command:
    - sleep
    - "3600"
    imagePullPolicy: IfNotPresent
    name: busybox
  restartPolicy: Always
```
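Assuming the manifest above is saved as busybox.yaml, launching it looks like:
```
kubectl create -f busybox.yaml
```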
I see:
```
kubectl describe pod busybox
7m  7m  2  {kubelet k8s-master.XXX}  Warning  MissingClusterDNS  kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to DNSDefault policy.
```
I restarted the kubelet service before launching this pod, and I have no other pod running.
If I launch a Docker container using the "--dns" option:
```
docker run --rm -it --dns 192.168.0.10 busybox nslookup cluster.local
Server:    192.168.0.10
Address 1: 192.168.0.10

Name:      cluster.local
Address 1: 192.168.0.10
Address 2: 172.16.50.2
Address 3: 192.168.0.1
Address 4: 172.16.96.3

docker run --rm -it --dns 192.168.0.10 busybox cat /etc/resolv.conf
search XXX YYYY
nameserver 192.168.0.10
```
That's absolutely normal (I've hidden my client DNS).
But the pod says something else:
```
kubectl exec busybox -- nslookup cluster.local
Server:    XXX.YYY.XXX.YYY
Address 1: XXX.YYYY.XXXX.YYY XXX.domain.fr

nslookup: can't resolve 'cluster.local'
error: error executing remote command: Error executing command in container: Error executing in Docker Container: 1
```
I tried to set the "--dns" option on the Docker daemon, but the error is the same.
See these logs:
```
kubectl get pods --namespace=kube-system
NAME                 READY     STATUS    RESTARTS   AGE
kube-dns-v11-osikn   4/4       Running   0          13m
```
And:
```
kubectl logs kube-dns-v11-osikn kube2sky --namespace=kube-system
I0621 15:44:48.168080    1 kube2sky.go:462] Etcd server found: http://127.0.0.1:4001
I0621 15:44:49.170404    1 kube2sky.go:529] Using https://192.168.0.1:443 for kubernetes master
I0621 15:44:49.170422    1 kube2sky.go:530] Using kubernetes API <nil>
I0621 15:44:49.170823    1 kube2sky.go:598] Waiting for service: default/kubernetes
I0621 15:44:49.209691    1 kube2sky.go:660] Successfully added DNS record for Kubernetes service.
```
"Using kubernetes API <nil>" is a problem, isn't it ?
Edit: I forced kube-master-url in the pod to let kube2sky contact the master.
```
kubectl logs kube-dns-v11-osikn skydns --namespace=kube-system
2016/06/21 15:44:50 skydns: falling back to default configuration, could not read from etcd: 100: Key not found (/skydns/config) [10]
2016/06/21 15:44:50 skydns: ready for queries on cluster.local. for tcp://0.0.0.0:53 [rcache 0]
2016/06/21 15:44:50 skydns: ready for queries on cluster.local. for udp://0.0.0.0:53 [rcache 0]
```
Note this too:
```
kubectl get pods --all-namespaces
NAMESPACE     NAME                 READY     STATUS    RESTARTS   AGE
default       busybox              1/1       Running   0          17m
kube-system   kube-dns-v11-osikn   4/4       Running   0          18m
```
So I've got no problem with SkyDNS.
I'm sure that the problem comes from kubelet. I've tried removing /var/lib/kubelet and restarting the entire cluster. I've also tried restarting the kubelet service before and after installing DNS. I changed the Docker configuration, then removed the "--dns" option afterward, and I get the same behaviour: Docker + DNS is OK, but kubelet gives a MissingClusterDNS error saying that kubelet has no configured cluster DNS.
So please... Help (one more time :) )
EDITS:
- Now kube2sky doesn't complain about the <nil> API version after forcing the kube-master-url option.
- I can force nslookup to use my SkyDNS:
```
kubectl exec busybox -- nslookup kubernetes.default.svc.cluster.local 192.168.0.10
Server:    192.168.0.10
Address 1: 192.168.0.10

Name:      kubernetes.default.svc.cluster.local
Address 1: 192.168.0.1
```
But the "MissingClusterDNS" error remains on pod creation, as if kubelet doesn't see the startup options "--cluster-dns" and "--cluster-domain".
@Brendan Burns:
```
kubectl get services --namespace=kube-system
NAME       CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE
kube-dns   192.168.0.10   <none>        53/UDP,53/TCP   12m
```
I finally solved my problem... Shame on me (or not).
I dug into the kubelet sources to understand what happens, and now I've found it.
In the "kubelet" file, I set:
KUBE_ARGS="--cluster-dns=10.10.0.10 --cluster-domain=cluster.local"
And the log I added in source says that "cluster-dns" option as this value:
10.10.0.10 --cluster-domain=cluster.local
That's mainly because the config file is interpreted by systemd as "bash environment vars", so KUBE_ARGS is passed as one single argument, which the kubelet service then parses badly.
The solution is to split the variable in two and change the kubelet.service file to use both vars. After a call to systemctl daemon-reload; systemctl restart kubelet, everything is OK. A sketch of the split is below.
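A minimal sketch of that split, with hypothetical file and variable names (the actual names depend on how your kubelet unit is packaged):
```
# /etc/kubernetes/kubelet (hypothetical path): one flag per variable
KUBE_CLUSTER_DNS="--cluster-dns=10.10.0.10"
KUBE_CLUSTER_DOMAIN="--cluster-domain=cluster.local"

# kubelet.service: reference each variable separately in ExecStart, e.g.
#   ExecStart=/opt/kubernetes/bin/kubelet ... $KUBE_CLUSTER_DNS $KUBE_CLUSTER_DOMAIN

# Reload and restart
systemctl daemon-reload
systemctl restart kubelet
```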
I opened an issue here: https://github.com/kubernetes/kubernetes/issues/27722, where I explain that the comment in the example config file is ambiguous and/or the arguments are not parsed as expected.
Have you created the DNS service with the right IP address?
What does kubectl get services --namespace=kube-system show?
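A quick sketch for checking that, assuming a kubectl new enough to support jsonpath output: the ClusterIP of the kube-dns service must match the --cluster-dns flag the kubelet was started with.
```
# The ClusterIP the DNS service actually has
kubectl get svc kube-dns --namespace=kube-system -o jsonpath='{.spec.clusterIP}'

# The flag the kubelet was actually started with
ps ax | grep kubelet | grep -o -- '--cluster-dns=[^ ]*'
```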
