Why does the pod terminate itself? - Azure

I am trying to install Fluentd with Elasticsearch and Kibana using the Bitnami Helm chart.
I am following the article below:
Integrate Logging Kubernetes Kibana ElasticSearch Fluentd
But when I deploy Elasticsearch, its pod ends up in a Terminating or Back-off state.
I have been stuck on this for 3 days; any help is appreciated.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 41m (x2 over 41m) default-scheduler error while running "VolumeBinding" filter plugin for pod "elasticsearch-master-0": pod has unbound immediate PersistentVolumeClaims
Normal Scheduled 41m default-scheduler Successfully assigned default/elasticsearch-master-0 to minikube
Normal Pulling 41m kubelet, minikube Pulling image "busybox:latest"
Normal Pulled 41m kubelet, minikube Successfully pulled image "busybox:latest"
Normal Created 41m kubelet, minikube Created container sysctl
Normal Started 41m kubelet, minikube Started container sysctl
Normal Pulling 41m kubelet, minikube Pulling image "docker.elastic.co/elasticsearch/elasticsearch-oss:6.8.6"
Normal Pulled 39m kubelet, minikube Successfully pulled image "docker.elastic.co/elasticsearch/elasticsearch-oss:6.8.6"
Normal Created 39m kubelet, minikube Created container chown
Normal Started 39m kubelet, minikube Started container chown
Normal Created 38m kubelet, minikube Created container elasticsearch
Normal Started 38m kubelet, minikube Started container elasticsearch
Warning Unhealthy 38m kubelet, minikube Readiness probe failed: Get http://172.17.0.7:9200/_cluster/health?local=true: dial tcp 172.17.0.7:9200: connect: connection refused
Normal Pulled 38m (x2 over 38m) kubelet, minikube Container image "docker.elastic.co/elasticsearch/elasticsearch-oss:6.8.6" already present on machine
Warning FailedMount 32m kubelet, minikube MountVolume.SetUp failed for volume "config" : failed to sync configmap cache: timed out waiting for the condition
Normal SandboxChanged 32m kubelet, minikube Pod sandbox changed, it will be killed and re-created.
Normal Pulling 32m kubelet, minikube Pulling image "busybox:latest"
Normal Pulled 32m kubelet, minikube Successfully pulled image "busybox:latest"
Normal Created 32m kubelet, minikube Created container sysctl
Normal Started 32m kubelet, minikube Started container sysctl
Normal Pulled 32m kubelet, minikube Container image "docker.elastic.co/elasticsearch/elasticsearch-oss:6.8.6" already present on machine
Normal Created 32m kubelet, minikube Created container chown
Normal Started 32m kubelet, minikube Started container chown
Normal Pulled 32m (x2 over 32m) kubelet, minikube Container image "docker.elastic.co/elasticsearch/elasticsearch-oss:6.8.6" already present on machine
Normal Created 32m (x2 over 32m) kubelet, minikube Created container elasticsearch
Normal Started 32m (x2 over 32m) kubelet, minikube Started container elasticsearch
Warning Unhealthy 32m kubelet, minikube Readiness probe failed: Get http://172.17.0.6:9200/_cluster/health?local=true: dial tcp 172.17.0.6:9200: connect: connection refused
Warning BackOff 32m (x2 over 32m) kubelet, minikube Back-off restarting failed container

The issue here is that the pod has unbound immediate PersistentVolumeClaims. You can set master.persistence.enabled to false when deploying it with Helm. Alternatively, you need to check whether a default storage class exists in the cluster; if it doesn't, create a storage class and make it the default.
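A minimal sketch of both options, assuming the Bitnami chart and a release named elasticsearch (the release name and the minikube storage class name "standard" are assumptions, not from the original post):
# Option 1: disable persistence for the master nodes
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install elasticsearch bitnami/elasticsearch \
  --set master.persistence.enabled=false

# Option 2: make sure a default storage class exists
kubectl get storageclass
# On minikube, enable the built-in provisioner and default class
minikube addons enable storage-provisioner
minikube addons enable default-storageclass
# Or mark an existing class (here "standard") as the default
kubectl patch storageclass standard \
  -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'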

Short answer: it crashed. You can check the Pod status object for details such as the exit status and whether it was an OOMKill, and then look at the container logs to see if they show anything.
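For example (a generic sketch, using the pod name from the events above):
# Last terminated state: exit code and reason (e.g. OOMKilled)
kubectl get pod elasticsearch-master-0 \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated}'
kubectl describe pod elasticsearch-master-0

# Logs of the current and the previously crashed container
kubectl logs elasticsearch-master-0
kubectl logs elasticsearch-master-0 --previous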

Related

AKS cannot pull docker image from private registry with Let's Encrypt certificate

I am getting an x509 certificate issue when AKS tries to pull a Docker image from my private repository secured with a Let's Encrypt certificate. How can I manage the certificate store in AKS to add the CA of my certificate, etc.?
Normal Scheduled 8m8s default-scheduler Successfully assigned default/proxy-deployment-568646f8d4-7gnnt to aks-default-26787434-vmss000000
Normal Pulling 6m34s (x4 over 8m7s) kubelet Pulling image "my registry/my-image:lts"
Warning Failed 6m34s (x4 over 8m7s) kubelet Failed to pull image "my registry/my-image:lts": rpc error: code = Unknown desc = Error response from daemon: Get https://my registry/v2/: x509: certificate signed by unknown authority
Warning Failed 6m34s (x4 over 8m7s) kubelet Error: ErrImagePull
Normal BackOff 6m18s (x6 over 8m7s) kubelet Back-off pulling image "my registry/my-image:lts"
Warning Failed 3m5s (x19 over 8m7s) kubelet Error: ImagePullBackOff
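Not from the original post, but a common cause of "x509: certificate signed by unknown authority" with Let's Encrypt is the registry serving only the leaf certificate without the intermediate chain; a quick check from any machine (the hostname below is a placeholder):
# The output should include the full chain, i.e. the Let's Encrypt intermediate
openssl s_client -connect myregistry.example.com:443 -showcerts </dev/null
If the chain turns out to be incomplete, serving fullchain.pem instead of cert.pem on the registry usually resolves the pull error.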

NodeJs api container crashing in kubernetes

As part of the CI/CD pipeline I deploy my web API to Kubernetes; the most recent branch I'm working on keeps crashing.
I have made sure the app runs locally for all the configurations, and the CI/CD pipeline on the master branch succeeds. I'm assuming some change I introduced is making the app fail, but I can't see any problem in the logs.
This is my Dockerfile
FROM node:12
WORKDIR /usr/src/app
ARG NODE_ENV
ENV NODE_ENV $NODE_ENV
COPY package.json /usr/src/app/
RUN npm install
COPY . /usr/src/app
ENV PORT 5000
EXPOSE $PORT
CMD [ "npm", "start" ]
This is what I get when I run kubectl describe on the corresponding pod:
Controlled By: ReplicaSet/review-refactor-e-0jmik1-7f75c45779
Containers:
auto-deploy-app:
Container ID: docker://8d6035b8ee0938262ea50e2f74d3ab627761fdf5b1811460b24f94a74f880810
Image: registry.gitlab.com/hidden-fox/metadata-service/refactor-endpoints:5e986c65d41743d9d6e6ede441a1cae316b3e751
Image ID: docker-pullable://registry.gitlab.com/hidden-fox/metadata-service/refactor-endpoints@sha256:de1e4478867f54a76f1c82374dcebb1d40b3eb0cde24caf936a21a4d16471312
Port: 5000/TCP
Host Port: 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Sat, 27 Jul 2019 19:18:07 +0100
Finished: Sat, 27 Jul 2019 19:18:49 +0100
Ready: False
Restart Count: 7
Liveness: http-get http://:5000/ delay=15s timeout=15s period=10s #success=1 #failure=3
Readiness: http-get http://:5000/ delay=5s timeout=3s period=10s #success=1 #failure=3
Environment Variables from:
review-refactor-e-0jmik1-secret Secret Optional: false
Environment:
DATABASE_URL: postgres://:@review-refactor-e-0jmik1-postgres:5432/
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-mvvfv (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
default-token-mvvfv:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-mvvfv
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 9m52s default-scheduler Successfully assigned metadata-service-13359548/review-refactor-e-0jmik1-7f75c45779-jfw22 to gke-qa2-default-pool-4dc045be-g8d9
Normal Pulling 9m51s kubelet, gke-qa2-default-pool-4dc045be-g8d9 pulling image "registry.gitlab.com/hidden-fox/metadata-service/refactor-endpoints:5e986c65d41743d9d6e6ede441a1cae316b3e751"
Normal Pulled 9m45s kubelet, gke-qa2-default-pool-4dc045be-g8d9 Successfully pulled image "registry.gitlab.com/hidden-fox/metadata-service/refactor-endpoints:5e986c65d41743d9d6e6ede441a1cae316b3e751"
Warning Unhealthy 8m58s kubelet, gke-qa2-default-pool-4dc045be-g8d9 Readiness probe failed: Get http://10.48.1.34:5000/: dial tcp 10.48.1.34:5000: connect: connection refused
Warning Unhealthy 8m28s (x6 over 9m28s) kubelet, gke-qa2-default-pool-4dc045be-g8d9 Readiness probe failed: HTTP probe failed with statuscode: 404
Normal Started 8m23s (x3 over 9m42s) kubelet, gke-qa2-default-pool-4dc045be-g8d9 Started container
Warning Unhealthy 8m23s (x6 over 9m23s) kubelet, gke-qa2-default-pool-4dc045be-g8d9 Liveness probe failed: HTTP probe failed with statuscode: 404
Normal Killing 8m23s (x2 over 9m3s) kubelet, gke-qa2-default-pool-4dc045be-g8d9 Killing container with id docker://auto-deploy-app:Container failed liveness probe.. Container will be killed and recreated.
Normal Pulled 8m23s (x2 over 9m3s) kubelet, gke-qa2-default-pool-4dc045be-g8d9 Container image "registry.gitlab.com/hidden-fox/metadata-service/refactor-endpoints:5e986c65d41743d9d6e6ede441a1cae316b3e751" already present on machine
Normal Created 8m23s (x3 over 9m43s) kubelet, gke-qa2-default-pool-4dc045be-g8d9 Created container
Warning BackOff 4m42s (x7 over 5m43s) kubelet, gke-qa2-default-pool-4dc045be-g8d9 Back-off restarting failed container
I expect the app to get deployed to Kubernetes, but instead I see a CrashLoopBackOff error.
I also don't see any application-specific errors in the logs.
I figured it out. I had to add an endpoint mapped to the root URL; apparently it gets pinged as part of the CD (the liveness and readiness probes in the events above hit / and got a 404), and if there is no successful response the container is killed and the job fails.
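A quick way to verify that fix before pushing, reusing the image tag from the describe output above (a sketch; it assumes the app listens on port 5000 as set in the Dockerfile):
# In one terminal: run the image locally; the probes expect GET / on port 5000
docker run --rm -p 5000:5000 registry.gitlab.com/hidden-fox/metadata-service/refactor-endpoints:5e986c65d41743d9d6e6ede441a1cae316b3e751
# In another terminal: the root endpoint must answer with a success status,
# otherwise the liveness probe keeps failing and the container is killed
curl -i http://localhost:5000/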

Node.js app in Kubernetes cluster doesn't stay running - CrashLoopBackOff

I have a small Node.js application for doing tests with Kubernetes, but it seems that the application does not keep running.
I put all the code I developed for the test on GitHub.
I run kubectl create -f deploy.yaml
It works, but...
[webapp@srvapih ex-node]$ kubectl get pods
NAME READY STATUS RESTARTS AGE
api-7b89bd4755-4lc6k 1/1 Running 0 5s
api-7b89bd4755-7x964 0/1 ContainerCreating 0 5s
api-7b89bd4755-dv299 1/1 Running 0 5s
api-7b89bd4755-w6tzj 0/1 ContainerCreating 0 5s
api-7b89bd4755-xnm8l 0/1 ContainerCreating 0 5s
[webapp@srvapih ex-node]$ kubectl get pods
NAME READY STATUS RESTARTS AGE
api-7b89bd4755-4lc6k 0/1 CrashLoopBackOff 1 11s
api-7b89bd4755-7x964 0/1 CrashLoopBackOff 1 11s
api-7b89bd4755-dv299 0/1 CrashLoopBackOff 1 11s
api-7b89bd4755-w6tzj 0/1 CrashLoopBackOff 1 11s
api-7b89bd4755-xnm8l 0/1 CrashLoopBackOff 1 11s
Events from kubectl describe pod:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Pulled 6m48s (x5 over 8m14s) kubelet, srvweb05.beirario.intranet Container image "node:8-alpine" already present on machine
Normal Created 6m48s (x5 over 8m14s) kubelet, srvweb05.beirario.intranet Created container
Normal Started 6m48s (x5 over 8m12s) kubelet, srvweb05.beirario.intranet Started container
Normal Scheduled 6m9s default-scheduler Successfully assigned default/api-7b89bd4755-4lc6k to srvweb05.beirario.intranet
Warning BackOff 3m2s (x28 over 8m8s) kubelet, srvweb05.beirario.intranet Back-off restarting failed container
All I can say here is that you are providing a task that finishes: command: ["/bin/sh","-c", "node", "servidor.js"]. With /bin/sh -c, only the argument immediately after -c is taken as the command string, so only node runs (servidor.js merely becomes $0); node gets no script, reaches end of input, and exits right away.
Instead, you should provide the command in a way that never completes, as in the sketch below.
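A sketch of the difference (the quoting/exec-form fix, not specific to this repo):
# As written: sh -c takes only the next argument as its command string,
# so only "node" runs and "servidor.js" becomes $0, not a script argument.
/bin/sh -c node servidor.js

# Either quote the whole command string...
/bin/sh -c "node servidor.js"

# ...or, in the container spec, drop the shell entirely:
#   command: ["node", "servidor.js"]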
kubectl describe on your pods shows that the container completed successfully with exit code 0:
Containers:
ex-node:
Container ID: docker://836ffd771b3514fd13ae3e6b8818a7f35807db55cf8f756e962131823a476675
Image: node:8-alpine
Image ID: docker-pullable://node@sha256:8e9987a6d91d783c56980f1bd4b23b4c05f9f6076d513d6350fef8fe09ed01fd
Port: 3000/TCP
Host Port: 0/TCP
Command:
/bin/sh
-c
node
servidor.js
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Fri, 08 Mar 2019 14:29:54 +0000
Finished: Fri, 08 Mar 2019 14:29:54 +0000
You may be using the process.stdout.write method in your code; this will cause the k8s session to be lost. Do not print anything to stdout!
Try using pm2: https://pm2.io/docs/runtime/integration/docker/. It keeps your Node.js app running under a process manager.
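A minimal sketch of that approach (assuming the entry point is servidor.js, as in the command shown earlier):
# pm2-runtime keeps the Node.js process in the foreground and restarts it on crashes
npm install -g pm2
pm2-runtime servidor.js
# In the Dockerfile this would typically become:
#   CMD ["pm2-runtime", "servidor.js"]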

Test image from Azure Container Registry

I created a simple Docker image from a "Hello World" java application.
This is my Dockerfile
FROM java:8
COPY . /var/www/java
WORKDIR /var/www/java
RUN javac HelloWorld.java
CMD ["java", "HelloWorld"]
I pushed the image (java-app) to Azure Container Registry.
$ az acr repository list --name AContainerRegistry --output table
Result
----------------
java-app
I want to deploy it
amhg$ kubectl run dockerproject --image=acontainerregistry.azurecr.io/java-app:v1
deployment.apps "dockerproject" created
amhg$ kubectl expose deployments dockerproject --port=80 --type=LoadBalancer
service "dockerproject" exposed
and when I check the pods, the pod has crashed:
amhg$ kubectl get pods
NAME READY STATUS RESTARTS AGE
dockerproject-b6799d879-pt5rx 0/1 CrashLoopBackOff 8 19m
Is there a way to "test"/run the image from the central registry? Why does it crash?
Here is the output of kubectl describe pod:
amhg$ kubectl describe pod dockerproject-64fbf7649-spc7h
Name: dockerproject-64fbf7649-spc7h
Namespace: default
Node: aks-nodepool1-39744669-0/10.240.0.4
Start Time: Thu, 19 Apr 2018 11:53:58 +0200
Labels: pod-template-hash=209693205
run=dockerproject
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"dockerproject-64fbf7649","uid":"946610e4-43b7-11e8-9537-0a58ac1...
Status: Running
IP: 10.244.0.38
Controlled By: ReplicaSet/dockerproject-64fbf7649
Containers:
dockerproject:
Container ID: docker://1f2a7a6870a37e4d6b53fc834b0d4d3b681e9faaacc3772177a918e66357404e
Image: acontainerregistry.azurecr.io/java-app:v1
Image ID: docker-pullable://acontainerregistry.azurecr.io/java-app@sha256:eaf6fe53a59de287ad76a18de2c7f05580b1f25153624161aadcc7b8ef47b0c4
Port: <none>
Host Port: <none>
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Thu, 19 Apr 2018 12:35:22 +0200
Finished: Thu, 19 Apr 2018 12:35:23 +0200
Ready: False
Restart Count: 13
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-vkpjm (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
default-token-vkpjm:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-vkpjm
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.alpha.kubernetes.io/notReady:NoExecute for 300s
node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 43m default-scheduler Successfully assigned dockerproject2-64fbf7649-spc7h to aks-nodepool1-39744669-0
Normal SuccessfulMountVolume 43m kubelet, aks-nodepool1-39744669-0 MountVolume.SetUp succeeded for volume "default-token-vkpjm"
Normal Pulled 43m (x4 over 43m) kubelet, aks-nodepool1-39744669-0 Container image "acontainerregistry.azurecr.io/java-app:v1" already present on machine
Normal Created 43m (x4 over 43m) kubelet, aks-nodepool1-39744669-0 Created container
Normal Started 43m (x4 over 43m) kubelet, aks-nodepool1-39744669-0 Started container
Warning FailedSync 8m (x161 over 43m) kubelet, aks-nodepool1-39744669-0 Error syncing pod
Warning BackOff 3m (x184 over 43m) kubelet, aks-nodepool1-39744669-0 Back-off restarting failed container
When you run an application in a Pod, Kubernetes expects that it will keep running as a daemon until you stop it somehow.
In your details about the pod I see this:
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Thu, 19 Apr 2018 12:35:22 +0200
Finished: Thu, 19 Apr 2018 12:35:23 +0200
It means that your application exited with code 0 (which means "all is OK") right after starting. So the image was successfully downloaded (the registry is fine) and run, but the application exited on its own.
That's why Kubernetes keeps trying to restart the pod.
The only thing I can suggest is to find the reason why the application stops and fix it.
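To answer the "is there a way to test/run the image from the registry" part: you can pull the image from ACR and run it locally, which makes the behaviour obvious (a sketch using the names from the question):
# Log in to the registry and run the image outside the cluster
az acr login --name AContainerRegistry
docker run --rm acontainerregistry.azurecr.io/java-app:v1
# It prints "Hello World" and exits with code 0, which is exactly
# the behaviour Kubernetes keeps restarting as a CrashLoopBackOff.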

How do I solve this error when running Apache Spark 2.3.0 on Kubernetes with a jar from a remote source?

Following the instructions here I have been trying to submit a spark job to minikube, using a remote URL:
minikube start
bin/spark-submit --master k8s://https://192.168.99.100:8443 --deploy-mode cluster --name spark-pi --class org.apache.spark.examples.SparkPi --conf spark.executor.instances=1 --conf spark.kubernetes.container.image=<default-spark-k8s-image-build> --conf spark.kubernetes.namespace=spark <https://remote-location-with-spark-example-jar>
The pod fails and when I describe it I get the error configmaps "spark-pi-ad386ea0f7e4333dbd2a0ad705e94d66-init-config" not found:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 50s default-scheduler Successfully assigned spark-pi-ad386ea0f7e4333dbd2a0ad705e94d66-driver to minikube
Warning FailedMount 49s kubelet, minikube MountVolume.SetUp failed for volume "spark-init-properties" : configmaps "spark-pi-ad386ea0f7e4333dbd2a0ad705e94d66-init-config" not found
Normal SuccessfulMountVolume 49s kubelet, minikube MountVolume.SetUp succeeded for volume "download-jars-volume"
Normal SuccessfulMountVolume 49s kubelet, minikube MountVolume.SetUp succeeded for volume "download-files-volume"
Normal SuccessfulMountVolume 49s kubelet, minikube MountVolume.SetUp succeeded for volume "default-token-4ghj8"
Normal SuccessfulMountVolume 49s kubelet, minikube MountVolume.SetUp succeeded for volume "spark-init-properties"
Normal Pulled 49s kubelet, minikube Container image "timg-spark/spark:latest" already present on machine
Normal Created 49s kubelet, minikube Created container
Normal Started 48s kubelet, minikube Started container
Normal Pulled 43s kubelet, minikube Container image "timmeh/spark:latest" already present on machine
Normal Created 43s kubelet, minikube Created container
Normal Started 43s kubelet, minikube Started container
However, there is no mention in the docs of creating any configmaps, and because the name of the configmap isn't known until you run spark-submit, I can't create one in advance to get more information.
For now my plan is to work around this by baking the jar files into the Spark Docker image, but if anyone knows more about why this is failing, that would be great!
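For reference, that workaround follows the standard local:// scheme from the Spark 2.3 docs, where the jar already lives inside the image; a sketch (the examples jar path is the one shipped in the stock Spark 2.3.0 image and is an assumption here):
bin/spark-submit --master k8s://https://192.168.99.100:8443 \
  --deploy-mode cluster --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=1 \
  --conf spark.kubernetes.container.image=<default-spark-k8s-image-build> \
  --conf spark.kubernetes.namespace=spark \
  local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar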