Can't resolve monitoring-influxdb on Kubernetes with heapster and kube-dns - linux

I am trying to get Heapster working on my Kubernetes cluster. I am using Kube-DNS for DNS resolution.
My Kube-DNS seems to be set up correctly:
kubectl describe pod kube-dns-v20-z2dd2 -n kube-system
Name: kube-dns-v20-z2dd2
Namespace: kube-system
Node: 172.31.48.201/172.31.48.201
Start Time: Mon, 22 Jan 2018 09:21:49 +0000
Labels: k8s-app=kube-dns
version=v20
Annotations: scheduler.alpha.kubernetes.io/critical-pod=
scheduler.alpha.kubernetes.io/tolerations=[{"key":"CriticalAddonsOnly", "operator":"Exists"}]
Status: Running
IP: 172.17.29.4
Controlled By: ReplicationController/kube-dns-v20
Containers:
kubedns:
Container ID: docker://13f95bdf8dee273ca18a2eee1b99fe00e5fff41279776cdef5d7e567472a39dc
Image: gcr.io/google_containers/kubedns-amd64:1.8
Image ID: docker-pullable://gcr.io/google_containers/kubedns-amd64@sha256:39264fd3c998798acdf4fe91c556a6b44f281b6c5797f464f92c3b561c8c808c
Ports: 10053/UDP, 10053/TCP
Args:
--domain=cluster.local.
--dns-port=10053
State: Running
Started: Mon, 22 Jan 2018 09:22:05 +0000
Ready: True
Restart Count: 0
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/healthz-kubedns delay=60s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:8081/readiness delay=3s timeout=5s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-9zxzd (ro)
dnsmasq:
Container ID: docker://576ebc30e8f7aae13000a2d06541c165a3302376ad04c604b12803463380d9b5
Image: gcr.io/google_containers/kube-dnsmasq-amd64:1.4
Image ID: docker-pullable://gcr.io/google_containers/kube-dnsmasq-amd64@sha256:a722df15c0cf87779aad8ba2468cf072dd208cb5d7cfcaedd90e66b3da9ea9d2
Ports: 53/UDP, 53/TCP
Args:
--cache-size=1000
--no-resolv
--server=127.0.0.1#10053
--log-facility=-
State: Running
Started: Mon, 22 Jan 2018 09:22:20 +0000
Ready: True
Restart Count: 0
Liveness: http-get http://:8080/healthz-dnsmasq delay=60s timeout=5s period=10s #success=1 #failure=5
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-9zxzd (ro)
healthz:
Container ID: docker://3367d05fb0e13c892243a4c86c74a170b0a9a2042387a70f6690ed946afda4d2
Image: gcr.io/google_containers/exechealthz-amd64:1.2
Image ID: docker-pullable://gcr.io/google_containers/exechealthz-amd64@sha256:503e158c3f65ed7399f54010571c7c977ade7fe59010695f48d9650d83488c0a
Port: 8080/TCP
Args:
--cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1 >/dev/null
--url=/healthz-dnsmasq
--cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1:10053 >/dev/null
--url=/healthz-kubedns
--port=8080
--quiet
State: Running
Started: Mon, 22 Jan 2018 09:22:32 +0000
Ready: True
Restart Count: 0
Limits:
memory: 50Mi
Requests:
cpu: 10m
memory: 50Mi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-9zxzd (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
default-token-9zxzd:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-9zxzd
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 43m default-scheduler Successfully assigned kube-dns-v20-z2dd2 to 172.31.48.201
Normal SuccessfulMountVolume 43m kubelet, 172.31.48.201 MountVolume.SetUp succeeded for volume "default-token-9zxzd"
Normal Pulling 43m kubelet, 172.31.48.201 pulling image "gcr.io/google_containers/kubedns-amd64:1.8"
Normal Pulled 43m kubelet, 172.31.48.201 Successfully pulled image "gcr.io/google_containers/kubedns-amd64:1.8"
Normal Created 43m kubelet, 172.31.48.201 Created container
Normal Started 43m kubelet, 172.31.48.201 Started container
Normal Pulling 43m kubelet, 172.31.48.201 pulling image "gcr.io/google_containers/kube-dnsmasq-amd64:1.4"
Normal Pulled 42m kubelet, 172.31.48.201 Successfully pulled image "gcr.io/google_containers/kube-dnsmasq-amd64:1.4"
Normal Created 42m kubelet, 172.31.48.201 Created container
Normal Started 42m kubelet, 172.31.48.201 Started container
Normal Pulling 42m kubelet, 172.31.48.201 pulling image "gcr.io/google_containers/exechealthz-amd64:1.2"
Normal Pulled 42m kubelet, 172.31.48.201 Successfully pulled image "gcr.io/google_containers/exechealthz-amd64:1.2"
Normal Created 42m kubelet, 172.31.48.201 Created container
Normal Started 42m kubelet, 172.31.48.201 Started container
kubectl describe svc kube-dns -n kube-system
Name: kube-dns
Namespace: kube-system
Labels: k8s-app=kube-dns
kubernetes.io/cluster-service=true
kubernetes.io/name=KubeDNS
Annotations: <none>
Selector: k8s-app=kube-dns
Type: ClusterIP
IP: 10.254.0.2
Port: dns 53/UDP
TargetPort: 53/UDP
Endpoints: 172.17.29.4:53
Port: dns-tcp 53/TCP
TargetPort: 53/TCP
Endpoints: 172.17.29.4:53
Session Affinity: None
Events: <none>
kubectl describe ep kube-dns -n kube-system
Name: kube-dns
Namespace: kube-system
Labels: k8s-app=kube-dns
kubernetes.io/cluster-service=true
kubernetes.io/name=KubeDNS
Annotations: <none>
Subsets:
Addresses: 172.17.29.4
NotReadyAddresses: <none>
Ports:
Name Port Protocol
---- ---- --------
dns 53 UDP
dns-tcp 53 TCP
Events: <none>
kubectl exec -it busybox1 -- nslookup kubernetes.default
Server: 10.254.0.2
Address 1: 10.254.0.2 kube-dns.kube-system.svc.cluster.local
Name: kubernetes.default
Address 1: 10.254.0.1 kubernetes.default.svc.cluster.local
However, when I try to resolve monitoring-influxdb from either the heapster container or the busybox container (the latter outside the kube-system namespace), it cannot be resolved:
kubectl exec -it heapster-v1.2.0-7657f45c77-65w7w --container heapster -n kube-system -- nslookup monitoring-influxdb
Server: (null)
Address 1: 127.0.0.1 localhost
Address 2: ::1 localhost
nslookup: can't resolve 'monitoring-influxdb': Try again
command terminated with exit code 1
kubectl exec -it heapster-v1.2.0-7657f45c77-65w7w --container heapster -n kube-system -- cat /etc/resolv.conf
nameserver 10.254.0.2
search kube-system.svc.cluster.local svc.cluster.local cluster.local eu-central-1.compute.internal
options ndots:5
kubectl exec -it busybox1 -- nslookup monitoring-influxdb
Server: 10.254.0.2
Address 1: 10.254.0.2 kube-dns.kube-system.svc.cluster.local
nslookup: can't resolve 'monitoring-influxdb'
command terminated with exit code 1
kubectl exec -it busybox1 -- cat /etc/resolv.conf
nameserver 10.254.0.2
search default.svc.cluster.local svc.cluster.local cluster.local eu-central-1.compute.internal
options ndots:5
Finally, here are the logs from the heapster pod. I could not find any errors in the DNS pod logs:
kubectl logs heapster-v1.2.0-7657f45c77-65w7w heapster -n kube-system
E0122 09:22:46.966896 1 influxdb.go:217] issues while creating an InfluxDB sink: failed to ping InfluxDB server at "monitoring-influxdb:8086" - Get http://monitoring-influxdb:8086/ping: dial tcp: lookup monitoring-influxdb on 10.254.0.2:53: server misbehaving, will retry on use
Any pointers are highly appreciated.
EDIT:
The monitoring-influxdb service is located in the same namespace as heapster (kube-system).
kubectl exec -it heapster-v1.2.0-7657f45c77-65w7w --container heapster -n kube-system -- nslookup monitoring-influxdb.kube-system
Server: (null)
Address 1: 127.0.0.1 localhost
Address 2: ::1 localhost
nslookup: can't resolve 'monitoring-influxdb.kube-system': Name does not resolve
command terminated with exit code 1
But for whatever reason busybox is able to resolve the server.
kubectl exec -it busybox1 -- nslookup monitoring-influxdb.kube-system
Server: 10.254.0.2
Address 1: 10.254.0.2 kube-dns.kube-system.svc.cluster.local
Name: monitoring-influxdb.kube-system
Address 1: 10.254.48.109 monitoring-influxdb.kube-system.svc.cluster.local
kubectl -n kube-system get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
heapster ClusterIP 10.254.193.208 <none> 80/TCP 1h
kube-dns ClusterIP 10.254.0.2 <none> 53/UDP,53/TCP 1h
kubernetes-dashboard NodePort 10.254.89.241 <none> 80:32431/TCP 1h
monitoring-grafana ClusterIP 10.254.176.96 <none> 80/TCP 1h
monitoring-influxdb ClusterIP 10.254.48.109 <none> 8083/TCP,8086/TCP 1h
kubectl -n kube-system get ep
NAME ENDPOINTS AGE
heapster 172.17.29.7:8082 1h
kube-controller-manager <none> 1h
kube-dns 172.17.29.6:53,172.17.29.6:53 1h
kubernetes-dashboard 172.17.29.5:9090 1h
monitoring-grafana 172.17.29.3:3000 1h
monitoring-influxdb 172.17.29.3:8086,172.17.29.3:8083 1h

In Kubernetes, you can resolve services by their name alone, but only if you are inside the same namespace.
Services are also reachable through a DNS name in the form:
<service name>.<namespace>
From your question it is not clear in which namespace you deployed InfluxDB, but give the above suggestion a try.
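For example, to rule out search-domain expansion issues, you can query the fully qualified service name directly from the heapster pod (the pod name and the DNS service IP are taken from the question above):
kubectl exec -it heapster-v1.2.0-7657f45c77-65w7w --container heapster -n kube-system -- nslookup monitoring-influxdb.kube-system.svc.cluster.local
kubectl exec -it heapster-v1.2.0-7657f45c77-65w7w --container heapster -n kube-system -- nslookup monitoring-influxdb.kube-system.svc.cluster.local 10.254.0.2
If the fully qualified name resolves but the short form does not, the problem is in how the search list from /etc/resolv.conf is being applied rather than in kube-dns itself.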

Related

Kubernetes nfs-subdir-external-provisioner stuck in ContainerCreating / Unable to attach or mount volumes: unmounted volumes=[nfs-client-root]

I'm trying to install an nfs-client-provisioner and run a MongoDB with it.
Unfortunately, the nfs-client-provisioner hangs in ContainerCreating and says "Warning FailedMount 3m35s (x13 over 37m) kubelet Unable to attach or mount volumes: unmounted volumes=[nfs-client-root], unattached volumes=[nfs-client-root kube-api-access-lr9tl]: timed out waiting for the condition".
The nfs server is configured on the same VPS machine (Debian 10).
I am able to mount and write files on the NFS server from a second VPS with Debian 10.
The cluster is set up with k0s.
I get the error with both the Helm chart and the manual installation.
Any help is appreciated.
For some more info see below:
Helm version:
version.BuildInfo{Version:"v3.8.2", GitCommit:"6e3701edea09e5d55a8ca2aae03a68917630e91b", GitTreeState:"clean", GoVersion:"go1.17.5"}
Kubernetes version:
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.6", GitCommit:"ad3338546da947756e8a88aa6822e9c11e7eac22", GitTreeState:"clean", BuildDate:"2022-04-14T08:49:13Z", GoVersion:"go1.17.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5+k0s", GitCommit:"5ab78974affb1a76f1e5687aaa8b02aeac4380b8", GitTreeState:"clean", BuildDate:"2022-03-24T22:59:27Z", GoVersion:"go1.17.8", Compiler:"gc", Platform:"linux/amd64"}
k0s version: v1.23.5+k0s.0
worker added with:
token=$(k0s token create --role=worker)
docker run -d --name k0s-worker1 --hostname k0s-worker1 --privileged -v /var/lib/k0s docker.io/k0sproject/k0s:latest k0s worker $token
kubectl get nodes
NAME STATUS ROLES AGE VERSION
k0s-worker9 Ready <none> 42m v1.23.5+k0s
v2202204173709187201 Ready control-plane 43m v1.23.5+k0s
kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
managed-nfs-storage k8s-sigs.io/nfs-subdir-external-provisioner Delete Immediate false 40m
/etc/exports
/data/nfs-storage *(rw,sync,no_root_squash,no_subtree_check,insecure)
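For reference, the export can be sanity-checked from the server and from a client like this (the IP is the NFS server address used in the manifests below):
sudo exportfs -rav
showmount -e 47.122.181.39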
Output with
sudo k0s kubectl describe pod nfs-client-provisioner-6889579fdb-t7j74
Name: nfs-client-provisioner-6889579fdb-t7j74
Namespace: default
Priority: 0
Node: k0s-worker9/172.17.0.2
Start Time: Tue, 26 Apr 2022 08:45:49 +0200
Labels: app=nfs-client-provisioner
pod-template-hash=6889579fdb
Annotations: kubernetes.io/psp: 00-k0s-privileged
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/nfs-client-provisioner-6889579fdb
Containers:
nfs-client-provisioner:
Container ID:
Image: gcr.io/k8s-staging-sig-storage/nfs-subdir-external-provisioner:v4.0.1
Image ID:
Port: <none>
Host Port: <none>
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment:
PROVISIONER_NAME: k8s-sigs.io/nfs-subdir-external-provisioner
NFS_SERVER: 47.122.181.39
NFS_PATH: /data/nfs-storage
Mounts:
/persistentvolumes from nfs-client-root (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lr9tl (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
nfs-client-root:
Type: NFS (an NFS mount that lasts the lifetime of a pod)
Server: 47.122.181.39
Path: /data/nfs-storage
ReadOnly: false
kube-api-access-lr9tl:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 18m default-scheduler Successfully assigned default/nfs-client-provisioner-6889579fdb-t7j74 to k0s-worker9
Warning FailedMount 2m42s (x6 over 16m) kubelet Unable to attach or mount volumes: unmounted volumes=[nfs-client-root], unattached volumes=[nfs-client-root kube-api-access-lr9tl]: timed out waiting for the condition
Warning FailedMount 24s (x2 over 7m14s) kubelet Unable to attach or mount volumes: unmounted volumes=[nfs-client-root], unattached volumes=[kube-api-access-lr9tl nfs-client-root]: timed out waiting for the condition
command using helm:
helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
--set nfs.server=47.122.181.39 \
--set nfs.path=/data/nfs-storage
without helm:
from https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner/tree/v4.0.2/deploy
kubectl create -f rbac.yaml
kubectl create -f class.yaml
kubectl create -f deployment.yaml
class.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: managed-nfs-storage
provisioner: k8s-sigs.io/nfs-subdir-external-provisioner # or choose another name, must match deployment's env PROVISIONER_NAME'
parameters:
archiveOnDelete: "false"
deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: nfs-client-provisioner
labels:
app: nfs-client-provisioner
# replace with namespace where provisioner is deployed
namespace: default
spec:
replicas: 1
strategy:
type: Recreate
selector:
matchLabels:
app: nfs-client-provisioner
template:
metadata:
labels:
app: nfs-client-provisioner
spec:
serviceAccountName: nfs-client-provisioner
containers:
- name: nfs-client-provisioner
image: gcr.io/k8s-staging-sig-storage/nfs-subdir-external-provisioner:v4.0.1
volumeMounts:
- name: nfs-client-root
mountPath: /persistentvolumes
env:
- name: PROVISIONER_NAME
value: k8s-sigs.io/nfs-subdir-external-provisioner
- name: NFS_SERVER
value: 47.122.181.39
- name: NFS_PATH
value: /data/nfs-storage
volumes:
- name: nfs-client-root
nfs:
server: 47.122.181.39
path: /data/nfs-storage
K0s cluster setup:
sudo curl -sSLf https://get.k0s.sh | sudo sh
sudo k0s install controller --enable-worker
sudo k0s start
sudo cp /var/lib/k0s/pki/admin.conf ~/admin.conf
export KUBECONFIG=~/admin.conf
token=$(k0s token create --role=worker)
docker run -d --name k0s-worker9 --hostname k0s-worker9 --privileged -v /var/lib/k0s docker.io/k0sproject/k0s:latest k0s worker $token
Have you tried setting the NFS mount options, e.g. roughly like this in the chart values:
nfs:
  mountOptions:
    - nfsvers=4
I also used this provisioner. It works only with NFSv4. Also check whether the mount point on your NFS server is the root (/) or not. If your provisioner is configured and ready, try to recreate the PVC.
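With the Helm install command from the question, that would look something like the following (a sketch; the nfs.mountOptions key and the --set list syntax are assumptions to verify against the chart's values.yaml):
helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
--set nfs.server=47.122.181.39 \
--set nfs.path=/data/nfs-storage \
--set 'nfs.mountOptions={nfsvers=4}'
After the provisioner pod is running, delete and re-apply the PVC so it gets provisioned with the new options.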

MetalLB works only in master Node, cant reach ip assigned from workers

I've successfully installed MetalLB on my bare-metal Kubernetes cluster, but only pods assigned to the master node seem to work.
MetalLB is configured in layer2 mode, with the range 192.168.0.100-192.168.0.200, and services do get an IP when their pods are assigned to worker nodes, but those IPs do not respond to any request.
If the assigned IP is curled from inside the node, it works, yet if it's curled from another node or machine, it doesn't respond.
Example:
# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx2-658ffbbcb6-w5w28 1/1 Running 0 4m51s 10.244.1.2 worker2.homelab.com <none> <none>
nginx21-65b87bcbcb-fv856 1/1 Running 0 4h32m 10.244.0.10 master1.homelab.com <none> <none>
# kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 5h49m
nginx2 LoadBalancer 10.111.192.206 192.168.0.111 80:32404/TCP 5h21m
nginx21 LoadBalancer 10.108.222.125 192.168.0.113 80:31387/TCP 4h43m
# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master1.homelab.com Ready control-plane,master 5h50m v1.20.2 192.168.0.20 <none> CentOS Linux 7 (Core) 3.10.0-1160.15.2.el7.x86_64 docker://20.10.3
worker2.homelab.com Ready <none> 10m v1.20.2 192.168.0.22 <none> CentOS Linux 7 (Core) 3.10.0-1160.15.2.el7.x86_64 docker://20.10.3
Deployment nginx2 (Worker2, the one that doest work)
kubectl describe svc nginx2
Name: nginx2
Namespace: default
Labels: <none>
Annotations: <none>
Selector: app=nginx2
Type: LoadBalancer
IP: 10.111.192.206
LoadBalancer Ingress: 192.168.0.111
Port: http 80/TCP
TargetPort: 80/TCP
NodePort: http 32404/TCP
Endpoints: 10.244.1.2:80
Session Affinity: None
External Traffic Policy: Cluster
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal nodeAssigned 10m (x6 over 5h23m) metallb-speaker announcing from node "master1.homelab.com"
Normal nodeAssigned 5m18s metallb-speaker announcing from node "worker2.homelab.com"
[root@worker2 ~]# curl 192.168.0.111
<!DOCTYPE html> ..... (Works)
[root@master1 ~]# curl 192.168.0.111
curl: (7) Failed connect to 192.168.0.111:80; No route to host
Deployment nginx21 (Master1, the one that works)
kubectl describe svc nginx21
Name: nginx21
Namespace: default
Labels: <none>
Annotations: <none>
Selector: app=nginx21
Type: LoadBalancer
IP: 10.108.222.125
LoadBalancer Ingress: 192.168.0.113
Port: http 80/TCP
TargetPort: 80/TCP
NodePort: http 31387/TCP
Endpoints: 10.244.0.10:80
Session Affinity: None
External Traffic Policy: Cluster
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal nodeAssigned 12m (x3 over 4h35m) metallb-speaker announcing from node "master1.homelab.com"
[root@worker2 ~]# curl 192.168.0.113
<!DOCTYPE html> ..... (Works)
[root@master1 ~]# curl 192.168.0.113
<!DOCTYPE html> ..... (Works)
--------- PING WORKS FROM OTHER MACHINES ----------
I've just found this out, so it might be a problem with iptables? I don't really know how it works with MetalLB; I can ping the IP (192.168.0.111) from other machines and it responds.
I figured out, after Matt's response, that it was the firewall blocking the access, so I simply allowed the whole network on port 80 and it worked.
[root@worker2 ~]# firewall-cmd --new-zone=kubernetes --permanent
success
[root@worker2 ~]# firewall-cmd --zone=kubernetes --add-source=192.168.0.1/16 --permanent
success
[root@worker2 ~]# firewall-cmd --zone=kubernetes --add-port=80/tcp --permanent
success
[root@worker2 ~]# firewall-cmd --reload
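To double-check that the zone actually picked up the source network and the port after the reload:
[root@worker2 ~]# firewall-cmd --zone=kubernetes --list-all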

nodeSelector constraint ignored on AKS on mixed node pools (windows/linux)?

Tried installing a basic Nginx ingress using Helm by running the following command:
helm install nginx-ingress --namespace ingress-basic ingress-nginx/ingress-nginx \
--set controller.service.loadBalancerIP='52.232.109.226' \
--set controller.nodeSelector."beta\.kubernetes\.io/os"='linux' \
--set defaultBackend.nodeSelector."beta\.kubernetes\.io/os"='linux' \
--set controller.replicaCount=1 \
--set rbac.create=true
Shortly after installing I noticed the pod was scheduled onto a Windows node instead of a Linux node:
wesley@Azure:~$ kubectl get pods -n ingress-basic -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-ingress-ingress-nginx-admission-create-jcp6x 0/1 ContainerCreating 0 18s <none> akswin000002 <none> <none>
wesley@Azure:~$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
aks-agentpool-59412422-vmss000000 Ready agent 5h32m v1.17.11 10.240.0.4 <none> Ubuntu 16.04.7 LTS 4.15.0-1096-azure docker://19.3.12
aks-linuxpool-59412422-vmss000000 Ready agent 5h32m v1.17.11 10.240.0.128 <none> Ubuntu 16.04.7 LTS 4.15.0-1096-azure docker://19.3.12
akswin000000 Ready agent 5h28m v1.17.11 10.240.0.35 <none> Windows Server 2019 Datacenter 10.0.17763.1397 docker://19.3.11
akswin000001 Ready agent 5h28m v1.17.11 10.240.0.66 <none> Windows Server 2019 Datacenter 10.0.17763.1397 docker://19.3.11
akswin000002 Ready agent 5h28m v1.17.11 10.240.0.97 <none> Windows Server 2019 Datacenter 10.0.17763.1397 docker://19.3.11
Running a describe on the nginx pod revealed that the Node-Selectors field remains set to <none>.
Name: nginx-ingress-ingress-nginx-admission-create-jcp6x
Namespace: ingress-basic
Priority: 0
PriorityClassName: <none>
Node: akswin000002/10.240.0.97
Start Time: Fri, 16 Oct 2020 20:09:36 +0000
Labels: app.kubernetes.io/component=admission-webhook
app.kubernetes.io/instance=nginx-ingress
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/version=0.40.2
controller-uid=d03091cd-8138-4923-a369-afeca669099c
helm.sh/chart=ingress-nginx-3.7.1
job-name=nginx-ingress-ingress-nginx-admission-create
Annotations: <none>
Status: Pending
IP:
Controlled By: Job/nginx-ingress-ingress-nginx-admission-create
Containers:
create:
Container ID:
Image: docker.io/jettech/kube-webhook-certgen:v1.3.0
Image ID:
Port: <none>
Host Port: <none>
Args:
create
--host=nginx-ingress-ingress-nginx-controller-admission,nginx-ingress-ingress-nginx-controller-admission.$(POD_NAMESPACE).svc
--namespace=$(POD_NAMESPACE)
--secret-name=nginx-ingress-ingress-nginx-admission
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment:
POD_NAMESPACE: ingress-basic (v1:metadata.namespace)
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from nginx-ingress-ingress-nginx-admission-token-8x6ct (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
nginx-ingress-ingress-nginx-admission-token-8x6ct:
Type: Secret (a volume populated by a Secret)
SecretName: nginx-ingress-ingress-nginx-admission-token-8x6ct
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SandboxChanged 5m (x5401 over 2h) kubelet, akswin000002 Pod sandbox changed, it will be killed and re-created.
Warning FailedCreatePodSandBox 31s (x5543 over 2h) kubelet, akswin000002 (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "nginx-ingress-ingress-nginx-admission-create-jcp6x": Error response from daemon: container a734c23d20338d7fed800752c19f5e94688fd38fe82c9e90fc14533bae90c6bc encountered an error during hcsshim::System::CreateProcess: failure in a Windows system call: The user name or password is incorrect. (0x52e) extra info: {"CommandLine":"cmd /S /C pauseloop.exe","User":"2000","WorkingDirectory":"C:\\","Environment":{"PATH":"c:\\Windows\\System32;c:\\Windows"},"CreateStdInPipe":true,"CreateStdOutPipe":true,"CreateStdErrPipe":true,"ConsoleSize":[0,0]}
I expected the pod to be scheduled onto a Linux node instead. Does anyone have a clue why this is happening? I saw no taints or anything, and this is just a newly spun-up cluster. The only workaround for now seems to be to scale the Windows node pool back to 0, install the nginx ingress, and then scale the Windows nodes up again.
Kubernetes version: 1.17.11
It works with the command below. The difference is the added admissionWebhooks.patch.nodeSelector, which also pins the admission webhook patch/create job pods (like the one that landed on a Windows node above) to Linux nodes.
https://learn.microsoft.com/en-us/azure/aks/ingress-basic#create-an-ingress-controller
helm install nginx-ingress ingress-nginx/ingress-nginx \
--namespace ingress-basic \
--set controller.replicaCount=2 \
--set controller.nodeSelector."beta\.kubernetes\.io/os"=linux \
--set defaultBackend.nodeSelector."beta\.kubernetes\.io/os"=linux \
--set controller.admissionWebhooks.patch.nodeSelector."beta\.kubernetes\.io/os"=linux
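To confirm the selectors took effect, list the pods in the namespace together with the nodes they landed on; the controller, default backend, and admission job pods should all be on the Linux nodes:
kubectl get pods -n ingress-basic -o wide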

how to solve EADDRINUSE in GKE container?

I am new to containers and using GKE. I used to run my node server app with npm run debug and am trying to do this as well on GKE using the shell of my container. When I log into the shell of the myapp container and do this I get:
> api_server#0.0.0 start /usr/src/app
> node src/
events.js:167
throw er; // Unhandled 'error' event
^
Error: listen EADDRINUSE :::8089
Normally I deal with this using something like killall -9 node, but when I do this it looks like I am kicked out of my shell and the container is restarted by Kubernetes. It seems Node is already using the port or something:
netstat -tulpn | grep 8089
tcp 0 0 :::8089 :::* LISTEN 23/node
How can I start my server from the shell?
My config files:
Dockerfile:
FROM node:10-alpine
RUN apk add --update \
libc6-compat
WORKDIR /usr/src/app
COPY package*.json ./
COPY templates-mjml/ templates-mjml/
COPY public/ public/
COPY src/ src/
COPY data/ data/
COPY config/ config/
COPY migrations/ migrations/
ENV NODE_ENV 'development'
ENV PORT '8089'
RUN npm install --development
myapp.yaml:
apiVersion: v1
kind: Service
metadata:
name: myapp
labels:
app: myapp
spec:
ports:
- port: 8089
name: http
selector:
app: myapp
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp
labels:
app: myapp
spec:
selector:
matchLabels:
app: myapp
template:
metadata:
labels:
app: myapp
spec:
containers:
- name: myapp
image: gcr.io/myproject-224713/firstapp:v4
ports:
- containerPort: 8089
env:
- name: POSTGRES_DB_HOST
value: 127.0.0.1:5432
- name: POSTGRES_DB_USER
valueFrom:
secretKeyRef:
name: mysecret
key: username
- name: POSTGRES_DB_PASSWORD
valueFrom:
secretKeyRef:
name: mysecret
key: password
- name: cloudsql-proxy
image: gcr.io/cloudsql-docker/gce-proxy:1.11
command: ["/cloud_sql_proxy",
"-instances=myproject-224713:europe-west4:mydatabase=tcp:5432",
"-credential_file=/secrets/cloudsql/credentials.json"]
securityContext:
runAsUser: 2
allowPrivilegeEscalation: false
volumeMounts:
- name: cloudsql-instance-credentials
mountPath: /secrets/cloudsql
readOnly: true
volumes:
- name: cloudsql-instance-credentials
secret:
secretName: cloudsql-instance-credentials
---
myrouter.yaml:
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
name: myapp-gateway
spec:
selector:
istio: ingressgateway
servers:
- port:
number: 80
name: http
protocol: HTTP
hosts:
- "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
name: myapp
spec:
hosts:
- "*"
gateways:
- myapp-gateway
http:
- match:
- uri:
prefix: /
route:
- destination:
host: myapp
weight: 100
websocketUpgrade: true
EDIT:
I got the following logs:
EDIT 2:
After adding a Feathers.js health service I get the following output for describe:
Name: myapp-95df4dcd6-lptnq
Namespace: default
Node: gke-standard-cluster-1-default-pool-59600833-pcj3/10.164.0.3
Start Time: Wed, 02 Jan 2019 22:08:33 +0100
Labels: app=myapp
pod-template-hash=518908782
Annotations: kubernetes.io/limit-ranger: LimitRanger plugin set: cpu request for container myapp; cpu request for container cloudsql-proxy
sidecar.istio.io/status:
{"version":"3c9617ff82c9962a58890e4fa987c69ca62487fda71c23f3a2aad1d7bb46c748","initContainers":["istio-init"],"containers":["istio-proxy"]...
Status: Running
IP: 10.44.3.17
Controlled By: ReplicaSet/myapp-95df4dcd6
Init Containers:
istio-init:
Container ID: docker://768b2327c6cfa57b3d25a7029e52ce6a88dec6848e91dd7edcdf9074c91ff270
Image: gcr.io/gke-release/istio/proxy_init:1.0.2-gke.0
Image ID: docker-pullable://gcr.io/gke-release/istio/proxy_init@sha256:e30d47d2f269347a973523d0c5d7540dbf7f87d24aca2737ebc09dbe5be53134
Port: <none>
Host Port: <none>
Args:
-p
15001
-u
1337
-m
REDIRECT
-i
*
-x
-b
8089,
-d
State: Terminated
Reason: Completed
Exit Code: 0
Started: Wed, 02 Jan 2019 22:08:34 +0100
Finished: Wed, 02 Jan 2019 22:08:35 +0100
Ready: True
Restart Count: 0
Environment: <none>
Mounts: <none>
Containers:
myapp:
Container ID: docker://5566a3e8242ec6755dc2f26872cfb024fab42d5f64aadc3db1258fcb834f8418
Image: gcr.io/myproject-224713/firstapp:v4
Image ID: docker-pullable://gcr.io/myproject-224713/firstapp@sha256:0cbd4fae0b32fa0da5a8e6eb56cb9b86767568d243d4e01b22d332d568717f41
Port: 8089/TCP
Host Port: 0/TCP
State: Running
Started: Wed, 02 Jan 2019 22:09:19 +0100
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Wed, 02 Jan 2019 22:08:35 +0100
Finished: Wed, 02 Jan 2019 22:09:19 +0100
Ready: False
Restart Count: 1
Requests:
cpu: 100m
Liveness: http-get http://:8089/health delay=15s timeout=20s period=10s #success=1 #failure=3
Readiness: http-get http://:8089/health delay=5s timeout=5s period=10s #success=1 #failure=3
Environment:
POSTGRES_DB_HOST: 127.0.0.1:5432
POSTGRES_DB_USER: <set to the key 'username' in secret 'mysecret'> Optional: false
POSTGRES_DB_PASSWORD: <set to the key 'password' in secret 'mysecret'> Optional: false
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-9vtz5 (ro)
cloudsql-proxy:
Container ID: docker://414799a0699abe38c9759f82a77e1a3e06123714576d6d57390eeb07611f9a63
Image: gcr.io/cloudsql-docker/gce-proxy:1.11
Image ID: docker-pullable://gcr.io/cloudsql-docker/gce-proxy@sha256:5c690349ad8041e8b21eaa63cb078cf13188568e0bfac3b5a914da3483079e2b
Port: <none>
Host Port: <none>
Command:
/cloud_sql_proxy
-instances=myproject-224713:europe-west4:osm=tcp:5432
-credential_file=/secrets/cloudsql/credentials.json
State: Running
Started: Wed, 02 Jan 2019 22:08:36 +0100
Ready: True
Restart Count: 0
Requests:
cpu: 100m
Environment: <none>
Mounts:
/secrets/cloudsql from cloudsql-instance-credentials (ro)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-9vtz5 (ro)
istio-proxy:
Container ID: docker://898bc95c6f8bde18814ef01ce499820d545d7ea2d8bf494b0308f06ab419041e
Image: gcr.io/gke-release/istio/proxyv2:1.0.2-gke.0
Image ID: docker-pullable://gcr.io/gke-release/istio/proxyv2@sha256:826ef4469e4f1d4cabd0dc846f9b7de6507b54f5f0d0171430fcd3fb6f5132dc
Port: <none>
Host Port: <none>
Args:
proxy
sidecar
--configPath
/etc/istio/proxy
--binaryPath
/usr/local/bin/envoy
--serviceCluster
myapp
--drainDuration
45s
--parentShutdownDuration
1m0s
--discoveryAddress
istio-pilot.istio-system:15007
--discoveryRefreshDelay
1s
--zipkinAddress
zipkin.istio-system:9411
--connectTimeout
10s
--statsdUdpAddress
istio-statsd-prom-bridge.istio-system:9125
--proxyAdminPort
15000
--controlPlaneAuthPolicy
NONE
State: Running
Started: Wed, 02 Jan 2019 22:08:36 +0100
Ready: True
Restart Count: 0
Requests:
cpu: 10m
Environment:
POD_NAME: myapp-95df4dcd6-lptnq (v1:metadata.name)
POD_NAMESPACE: default (v1:metadata.namespace)
INSTANCE_IP: (v1:status.podIP)
ISTIO_META_POD_NAME: myapp-95df4dcd6-lptnq (v1:metadata.name)
ISTIO_META_INTERCEPTION_MODE: REDIRECT
Mounts:
/etc/certs/ from istio-certs (ro)
/etc/istio/proxy from istio-envoy (rw)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
cloudsql-instance-credentials:
Type: Secret (a volume populated by a Secret)
SecretName: cloudsql-instance-credentials
Optional: false
default-token-9vtz5:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-9vtz5
Optional: false
istio-envoy:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium: Memory
istio-certs:
Type: Secret (a volume populated by a Secret)
SecretName: istio.default
Optional: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 68s default-scheduler Successfully assigned myapp-95df4dcd6-lptnq to gke-standard-cluster-1-default-pool-59600833-pcj3
Normal SuccessfulMountVolume 68s kubelet, gke-standard-cluster-1-default-pool-59600833-pcj3 MountVolume.SetUp succeeded for volume "istio-envoy"
Normal SuccessfulMountVolume 68s kubelet, gke-standard-cluster-1-default-pool-59600833-pcj3 MountVolume.SetUp succeeded for volume "default-token-9vtz5"
Normal SuccessfulMountVolume 68s kubelet, gke-standard-cluster-1-default-pool-59600833-pcj3 MountVolume.SetUp succeeded for volume "cloudsql-instance-credentials"
Normal SuccessfulMountVolume 68s kubelet, gke-standard-cluster-1-default-pool-59600833-pcj3 MountVolume.SetUp succeeded for volume "istio-certs"
Normal Pulled 67s kubelet, gke-standard-cluster-1-default-pool-59600833-pcj3 Container image "gcr.io/gke-release/istio/proxy_init:1.0.2-gke.0" already present on machine
Normal Created 67s kubelet, gke-standard-cluster-1-default-pool-59600833-pcj3 Created container
Normal Started 67s kubelet, gke-standard-cluster-1-default-pool-59600833-pcj3 Started container
Normal Pulled 66s kubelet, gke-standard-cluster-1-default-pool-59600833-pcj3 Container image "gcr.io/cloudsql-docker/gce-proxy:1.11" already present on machine
Normal Created 66s kubelet, gke-standard-cluster-1-default-pool-59600833-pcj3 Created container
Normal Started 66s kubelet, gke-standard-cluster-1-default-pool-59600833-pcj3 Started container
Normal Created 65s kubelet, gke-standard-cluster-1-default-pool-59600833-pcj3 Created container
Normal Started 65s kubelet, gke-standard-cluster-1-default-pool-59600833-pcj3 Started container
Normal Pulled 65s kubelet, gke-standard-cluster-1-default-pool-59600833-pcj3 Container image "gcr.io/gke-release/istio/proxyv2:1.0.2-gke.0" already present on machine
Normal Created 65s kubelet, gke-standard-cluster-1-default-pool-59600833-pcj3 Created container
Normal Started 65s kubelet, gke-standard-cluster-1-default-pool-59600833-pcj3 Started container
Warning Unhealthy 31s (x4 over 61s) kubelet, gke-standard-cluster-1-default-pool-59600833-pcj3 Readiness probe failed: HTTP probe failed with statuscode: 404
Normal Pulled 22s (x2 over 66s) kubelet, gke-standard-cluster-1-default-pool-59600833-pcj3 Container image "gcr.io/myproject-224713/firstapp:v4" already present on machine
Warning Unhealthy 22s (x3 over 42s) kubelet, gke-standard-cluster-1-default-pool-59600833-pcj3 Liveness probe failed: HTTP probe failed with statuscode: 404
Normal Killing 22s kubelet, gke-standard-cluster-1-default-pool-59600833-pcj3 Killing container with id docker://myapp:Container failed liveness probe.. Container will be killed and recreated.
This is just how Kubernetes works: as long as your pod has processes running it will remain 'up'. The moment you kill one of its processes, Kubernetes will restart the pod because it crashed or something went wrong.
If you really want to debug with npm run debug, consider one of the following:
Create a container whose CMD (at the end) or ENTRYPOINT value in your Dockerfile is npm run debug, then run it using a Deployment definition in Kubernetes (see the Dockerfile sketch after the snippet below).
Override the command of the myapp container in your deployment definition with something like:
spec:
containers:
- name: myapp
image: gcr.io/myproject-224713/firstapp:v4
ports:
- containerPort: 8089
command: ["npm", "run", "debug" ]
env:
- name: POSTGRES_DB_HOST
value: 127.0.0.1:5432
- name: POSTGRES_DB_USER
valueFrom:
secretKeyRef:
name: mysecret
key: username
- name: POSTGRES_DB_PASSWORD
valueFrom:
secretKeyRef:
name: mysecret
key: password
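For the first option, a minimal sketch is to append a CMD to the end of the Dockerfile shown in the question, so the container starts the debug script by default:
# appended at the end of the existing Dockerfile
CMD ["npm", "run", "debug"]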

Cannot access Web API deployed in Azure ACS Kubernetes Cluster

Please help. I am trying to deploy a web API to an Azure ACS Kubernetes cluster. It is a simple web API created in VSTS, and the result should be like this: { "value1", "value2" }.
I plan to make the type ClusterIP, but I want to test and access it first, which is why this is a LoadBalancer for now. The pods are running with no restarts (I think that's good).
The guide I'm following is: Running Web API using Docker and Kubernetes
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 3d
sampleapi-service LoadBalancer 10.0.238.155 102.51.223.6 80:31676/TCP 1h
When I tried to browse the IP 102.51.223.6/api/values it says:
"This site can’t be reached"
service.yaml
kind: Service
apiVersion: v1
metadata:
name: sampleapi-service
labels:
name: sampleapi
app: sampleapi
spec:
selector:
name: sampleapi
ports:
- protocol: "TCP"
# Port accessible inside the cluster
port: 80
# Port to forward to inside the pod
targetPort: 80
# Port accessible outside the cluster
#nodePort: 80
type: LoadBalancer
deployment.yml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: sampleapi-deployment
spec:
replicas: 3
template:
metadata:
labels:
app: sampleapi
spec:
containers:
- name: sampleapi
image: mycontainerregistry.azurecr.io/sampleapi:latest
ports:
- containerPort: 80
POD
Name: sampleapi-deployment-498305766-zzs2z
Namespace: default
Node: c103facs9001/10.240.0.4
Start Time: Fri, 27 Jul 2018 00:20:06 +0000
Labels: app=sampleapi
pod-template-hash=498305766
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"sampleapi-deployme
-498305766","uid":"d064a8e0-9132-11e8-b58d-0...
Status: Running
IP: 10.244.2.223
Controlled By: ReplicaSet/sampleapi-deployment-498305766
Containers:
sampleapi:
Container ID: docker://19d414c87ebafe1cc99d101ac60f1113533e44c24552c75af4ec197d3d3c9c53
Image: mycontainerregistry.azurecr.io/sampleapi:latest
Image ID: docker-pullable://mycontainerregistry.azurecr.io/sampleapi@sha256:9635a9df168ef76a6a27cd46cb15620d762657e9b57a5ac2514ba0b9a8f47a8d
Port: 80/TCP
Host Port: 0/TCP
State: Running
Started: Fri, 27 Jul 2018 00:20:48 +0000
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-mj5m1 (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
default-token-mj5m1:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-mj5m1
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 50m default-scheduler Successfully assigned sampleapi-deployment-498305766-zzs2z to c103facs9001
Normal SuccessfulMountVolume 50m kubelet, c103facs9001 MountVolume.SetUp succeeded for volume "default-token-mj5m1"
Normal Pulling 49m kubelet, c103facs9001 pulling image "mycontainerregistry.azurecr.io/sampleapi:latest"
Normal Pulled 49m kubelet, c103facs9001 Successfully pulled image "mycontainerregistry.azurecr.io/sampleapi:latest"
Normal Created 49m kubelet, c103facs9001 Created container
Normal Started 49m kubelet, c103facs9001 Started container
It seems to me that your service isn't wired up to a port on the container. You have your targetPort commented out, so the service is reachable on port 80 but it doesn't know which port on the pod to target.
You will need the service to expose the internal port on some external IP:port that can be used in your browser to access the service. Try this after deploying your deployment and service yml files:
kubectl get service sampleapi-service
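As an extra sanity check, the service's endpoints show whether its selector actually matches any pods; if the list comes back empty, the external IP has nothing to forward traffic to:
kubectl get endpoints sampleapi-service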
