Elasticsearch Pod failing after Init state without logs - Azure

I'm trying to get an Elasticsearch StatefulSet to work on AKS but the pods fail and are terminated before I'm able to see any logs. Is there a way to see the logs after the Pods are terminated?
This is the sample YAML file I'm running with kubectl apply -f es-statefulset.yaml:
# RBAC authn and authz
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: elasticsearch-logging
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups:
  - ""
  resources:
  - "services"
  - "namespaces"
  - "endpoints"
  verbs:
  - "get"
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: kube-system
  name: elasticsearch-logging
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
subjects:
- kind: ServiceAccount
  name: elasticsearch-logging
  namespace: kube-system
  apiGroup: ""
roleRef:
  kind: ClusterRole
  name: elasticsearch-logging
  apiGroup: ""
---
# Elasticsearch deployment itself
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    version: v6.4.1
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  serviceName: elasticsearch-logging
  replicas: 2
  selector:
    matchLabels:
      k8s-app: elasticsearch-logging
      version: v6.4.1
  template:
    metadata:
      labels:
        k8s-app: elasticsearch-logging
        version: v6.4.1
        kubernetes.io/cluster-service: "true"
    spec:
      serviceAccountName: elasticsearch-logging
      containers:
      - image: docker.elastic.co/elasticsearch/elasticsearch:6.4.1
        name: elasticsearch-logging
        resources:
          # need more cpu upon initialization, therefore burstable class
          limits:
            cpu: "1000m"
            memory: "2048Mi"
          requests:
            cpu: "100m"
            memory: "1024Mi"
        ports:
        - containerPort: 9200
          name: db
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        volumeMounts:
        - name: elasticsearch-logging
          mountPath: /data
        env:
        - name: "NAMESPACE"
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: "bootstrap.memory_lock"
          value: "true"
        - name: "ES_JAVA_OPTS"
          value: "-Xms1024m -Xmx2048m"
        - name: "discovery.zen.ping.unicast.hosts"
          value: "elasticsearch-logging"
      # A) This volume mount (emptyDir) can be set whenever not working with a
      # cloud provider. There will be no persistence. If you want to avoid
      # data wipeout when the pod is recreated make sure to have a
      # "volumeClaimTemplates" at the bottom.
      # volumes:
      # - name: elasticsearch-logging
      #   emptyDir: {}
      #
      # Elasticsearch requires vm.max_map_count to be at least 262144.
      # If your OS already sets up this number to a higher value, feel free
      # to remove this init container.
      initContainers:
      - image: alpine:3.6
        command: ["/sbin/sysctl", "-w", "vm.max_map_count=262144"]
        name: elasticsearch-logging-init
        securityContext:
          privileged: true
  # B) This will request storage on Azure (configure other clouds if necessary)
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-logging
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: default
      resources:
        requests:
          storage: 64Gi
When I "follow" the pods creating looks like this:
I tried to get the logs from the terminated instance with kubectl logs -n kube-system elasticsearch-logging-0 -p and got nothing.
I'm trying to build on top of this sample from the official (unmaintained) k8s repo. It worked at first, but after I tried updating the deployment it failed completely and I haven't been able to get it back up. I'm using the trial version of Azure AKS.
I appreciate any suggestions
EDIT 1:
The result of kubectl describe statefulset elasticsearch-logging -n kube-system is the following (with an almost identical Init-Terminated pod flow):
Name:               elasticsearch-logging
Namespace:          kube-system
CreationTimestamp:  Mon, 24 Sep 2018 10:09:07 -0600
Selector:           k8s-app=elasticsearch-logging,version=v6.4.1
Labels:             addonmanager.kubernetes.io/mode=Reconcile
                    k8s-app=elasticsearch-logging
                    kubernetes.io/cluster-service=true
                    version=v6.4.1
Annotations:        kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"apps/v1","kind":"StatefulSet","metadata":{"annotations":{},"labels":{"addonmanager.kubernetes.io/mode":"Reconcile","k8s-app":"elasticsea...
Replicas:           0 desired | 1 total
Update Strategy:    RollingUpdate
Pods Status:        0 Running / 1 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:           k8s-app=elasticsearch-logging
                    kubernetes.io/cluster-service=true
                    version=v6.4.1
  Service Account:  elasticsearch-logging
  Init Containers:
   elasticsearch-logging-init:
    Image:      alpine:3.6
    Port:       <none>
    Host Port:  <none>
    Command:
      /sbin/sysctl
      -w
      vm.max_map_count=262144
    Environment:  <none>
    Mounts:       <none>
  Containers:
   elasticsearch-logging:
    Image:       docker.elastic.co/elasticsearch/elasticsearch:6.4.1
    Ports:       9200/TCP, 9300/TCP
    Host Ports:  0/TCP, 0/TCP
    Limits:
      cpu:     1
      memory:  2Gi
    Requests:
      cpu:     100m
      memory:  1Gi
    Environment:
      NAMESPACE:                         (v1:metadata.namespace)
      bootstrap.memory_lock:             true
      ES_JAVA_OPTS:                      -Xms1024m -Xmx2048m
      discovery.zen.ping.unicast.hosts:  elasticsearch-logging
    Mounts:
      /data from elasticsearch-logging (rw)
  Volumes:  <none>
Volume Claims:
  Name:          elasticsearch-logging
  StorageClass:  default
  Labels:        <none>
  Annotations:   <none>
  Capacity:      64Gi
  Access Modes:  [ReadWriteMany]
Events:
  Type    Reason            Age  From                    Message
  ----    ------            ---  ----                    -------
  Normal  SuccessfulCreate  53s  statefulset-controller  create Pod elasticsearch-logging-0 in StatefulSet elasticsearch-logging successful
  Normal  SuccessfulDelete  1s   statefulset-controller  delete Pod elasticsearch-logging-0 in StatefulSet elasticsearch-logging successful
The flow remains the same: the pod is created, goes through Init, and is then terminated.

You're assuming that the pods are terminated due to an ES related error.
I'm not so sure ES even got to run to begin with, which should explain the lack of logs.
Having multiple pods with the same name is extremely suspicious, especially in a StatefulSet, so something's wrong there.
I'd try kubectl describe statefulset elasticsearch-logging -n kube-system first, that should explain what's going on -- probably some issue mounting the volumes prior to running ES.
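As a quick sketch of what I would run for that (names and namespace taken from your manifest):
kubectl describe pod elasticsearch-logging-0 -n kube-system
kubectl get events -n kube-system --sort-by=.metadata.creationTimestamp
kubectl get pvc -n kube-system
The pod's describe output shows the init and main containers' State, Last State and exit codes, and the events usually name the PVC or the attach/mount error if the volume is the problem.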
I'm also pretty sure you want to change ReadWriteOnce to ReadWriteMany.
Hope this helps!

Yes. There's a way. You can ssh into the machine running your pods, and assuming you are using Docker you can run:
docker ps -a # Shows all the Exited containers (some of those, part of your pod)
Then:
docker logs <container-id-of-your-exited-elasticsearch-container>
This also works if you are using CRI-O or containerd; it would be something like
crictl logs <container-id>
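To find that container ID on such a node, you can list exited containers first (a sketch, assuming crictl is available on the node):
crictl ps -a | grep elasticsearch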

Related

Why unable to access Azure aks Service external IP with browser?

apiVersion: apps/v1
kind: Deployment
metadata:
  name: farwell-helloworld-net3-webapi
spec:
  replicas: 1
  selector:
    matchLabels:
      app: farwell-helloworld-net3-webapi
  template:
    metadata:
      labels:
        app: farwell-helloworld-net3-webapi
    spec:
      nodeSelector:
        "kubernetes.io/os": linux
      containers:
      - name: farwell-helloworld-net3-webapi
        image: devcontainerregistry.azurecr.cn/farwell.helloworld.net3.webapi:latest
        env:
        - name: ALLOW_EMPTY_PASSWORD
          value: "yes"
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 250m
            memory: 256Mi
        ports:
        - containerPort: 8077
---
apiVersion: v1
kind: Service
metadata:
  name: farwell-helloworld-net3-webapi
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8077
  selector:
    app: farwell-helloworld-net3-webapi
I use this command: kubectl get service farwell-helloworld-net3-webapi --watch
then I get this:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
farwell-helloworld-net3-webapi LoadBalancer 10.0.116.13 52.131.xxx.xxx 80:30493/TCP 13m
I have already opened the Azure port 80, but I cannot access http://52.131.xxx.xxx/WeatherForecast.
Could you help me, please, or tell me some steps to find the reason?
It seems that after I changed the port number from 8077 to 80, it runs well.
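For reference, a minimal sketch of how the ports have to line up (this assumes the .NET app inside the container actually listens on port 80 rather than 8077):
# Deployment (fragment)
ports:
- containerPort: 80   # the port the app really listens on
# Service (fragment)
ports:
- port: 80            # port exposed by the LoadBalancer
  targetPort: 80      # must match the containerPort / listening port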

Kubernetes Crashloopbackoff With Minikube

So I am learning about Kubernetes with a guide, and I am trying to deploy a MongoDB Pod with 1 replica. This is the deployment config file.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongodb-deployment
  labels:
    app: mongodb
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
      - name: mongodb
        image: mongo
        ports:
        - containerPort: 27017
        env:
        - name: MONGO_INITDB_ROOT_USERNAME
          valueFrom:
            secretKeyRef:
              name: mongodb-secret
              key: mongo-root-username
        - name: MONGO_INITDB_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mongodb-secret
              key: mongo-root-password
---
apiVersion: v1
kind: Service
metadata:
  name: mongodb-service
spec:
  selector:
    app: mongodb
  ports:
  - protocol: TCP
    port: 27017
    targetPort: 27017
I also tried to deploy a Mongo-Express Pod with almost the same config file, but I keep getting CrashLoopBackOff for both Pods. From the little understanding I have, this is caused by the container failing and restarting in a cycle. I tried going through the events with kubectl get events and I see that a warning with the message Back-off restarting failed container keeps occurring. I also did a little digging around and came across a solution that says to add
command: ['sleep']
args: ['infinity']
That fixed the CrashLoopBackOff issue, but when I try to get the logs for the Pod, nothing is displayed on the terminal. I need some help and a possible explanation of how the command and args seem to fix it, and how I can stop this crash from happening to my Pods. Thank you very much.
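A hedged aside on why that "fix" works: command: ['sleep'] with args: ['infinity'] overrides the image's entrypoint, so the container just sleeps instead of starting mongod - it stops crashing, but it also never runs the database, which is why the logs stay empty. To see why the real container crashes, the usual checks are (the pod name below is a placeholder):
kubectl logs <mongodb-pod-name> --previous   # logs from the last crashed attempt
kubectl describe pod <mongodb-pod-name>      # shows Last State, exit code and recent events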
My advice is to deploy MongoDB as a StatefulSet on Kubernetes.
In a stateful application, the N replicas of master nodes manage several worker nodes in a cluster, so if any master node goes down, another ordinal instance is available to execute the workflow. Each master node instance is identified by a unique, stable ordinal index, which is what a StatefulSet provides.
See more: mongodb-sts, mongodb-on-kubernetes.
Also use a Headless Service to manage the domain of the Pods. With a Headless Service there is no LoadBalancer and no single Service IP proxied by kube-proxy; clients reach the Pods directly, which is why the clusterIP is set to None.
In your case:
apiVersion: v1
kind: Service
metadata:
  name: mongodb
spec:
  clusterIP: None
  selector:
    app: mongodb
  ports:
  - port: 27017
The error:
Also uncaught exception: Error: couldn't add user: Error preflighting normalization: U_STRINGPREP_PROHIBITED_ERROR _getErrorWithCode#src/mongo/shell/utils.js:25:13
indicates that the secret may be missing. Take a look: mongodb-initializating.
In your case the secret should look similar to this:
apiVersion: v1
kind: Secret
metadata:
  name: mongodb-secret
type: Opaque
data:
  mongo-root-username: YWRtaW4=
  mongo-root-password: MWYyZDFlMmU2N2Rm
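The base64 strings under data are just the encoded credentials; with the example values above they can be generated like this:
echo -n 'admin' | base64          # YWRtaW4=
echo -n '1f2d1e2e67df' | base64   # MWYyZDFlMmU2N2Rm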
Remember to also configure a volume for your pods - follow the tutorials I have linked above.
Deploy MongoDB with a StatefulSet, not as a Deployment.
Example:
apiVersion: v1
kind: Service
metadata:
  name: mongodb-service
  labels:
    name: mongo
spec:
  ports:
  - port: 27017
    targetPort: 27017
  clusterIP: None
  selector:
    role: mongo
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongod
spec:
  serviceName: mongodb-service
  replicas: 3
  template:
    metadata:
      labels:
        role: mongo
        environment: test
        replicaset: MainRepSet
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: replicaset
                  operator: In
                  values:
                  - MainRepSet
              topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 10
      volumes:
      - name: secrets-volume
        secret:
          secretName: shared-bootstrap-data
          defaultMode: 256
      containers:
      - name: mongod-container
        #image: pkdone/mongo-ent:3.4
        image: mongo
        command:
        - "numactl"
        - "--interleave=all"
        - "mongod"
        - "--wiredTigerCacheSizeGB"
        - "0.1"
        - "--bind_ip"
        - "0.0.0.0"
        - "--replSet"
        - "MainRepSet"
        - "--auth"
        - "--clusterAuthMode"
        - "keyFile"
        - "--keyFile"
        - "/etc/secrets-volume/internal-auth-mongodb-keyfile"
        - "--setParameter"
        - "authenticationMechanisms=SCRAM-SHA-1"
        resources:
          requests:
            cpu: 0.2
            memory: 200Mi
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: secrets-volume
          readOnly: true
          mountPath: /etc/secrets-volume
        - name: mongodb-persistent-storage-claim
          mountPath: /data/db
  volumeClaimTemplates:
  - metadata:
      name: mongodb-persistent-storage-claim
      annotations:
        volume.beta.kubernetes.io/storage-class: "standard"
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi

Create a new volume when pod restart in a statefulset

I'm using Azure AKS to create a StatefulSet with a volume using the Azure disk provisioner.
I'm trying to find a way to write my StatefulSet YAML file so that when a pod restarts, it gets a new volume and the old volume is deleted.
I know I can delete volumes manually, but is there any way to tell Kubernetes to do this via the StatefulSet YAML?
Here is my YAML:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: janusgraph
  labels:
    app: janusgraph
spec:
  ...
  ...
  template:
    metadata:
      labels:
        app: janusgraph
    spec:
      containers:
      - name: janusgraph
        ...
        ...
        volumeMounts:
        - name: data
          mountPath: /var/lib/janusgraph
        livenessProbe:
          httpGet:
            port: 8182
            path: ?gremlin=g.V(123).count()
          initialDelaySeconds: 120
          periodSeconds: 10
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "default"
      resources:
        requests:
          storage: 7Gi
If you want your data to be deleted when the pod restarts, you can use an ephemeral volume like emptyDir.
When a Pod is removed/restarted for any reason, the data in the emptyDir is deleted forever.
Sample:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx # has to match .spec.template.metadata.labels
  serviceName: "nginx"
  replicas: 3 # by default is 1
  template:
    metadata:
      labels:
        app: nginx # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
      volumes:
      - name: www
        emptyDir: {}
N.B.:
By default, emptyDir volumes are stored on whatever medium is backing the node - that might be disk or SSD or network storage, depending on your environment. However, you can set the emptyDir.medium field to "Memory" to tell Kubernetes to mount a tmpfs (RAM-backed filesystem) for you instead.
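A minimal sketch of that variant, reusing the volume from the sample above (sizeLimit is optional):
volumes:
- name: www
  emptyDir:
    medium: Memory
    sizeLimit: 256Mi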

Kubernetes: Cassandra(stateful set) deployment on GCP

Has anyone tried deploying Cassandra (POC) on GCP using Kubernetes (not GKE)? If so, can you please share info on how to get it working?
You could start by looking at IBM's Scalable-Cassandra-deployment-on-Kubernetes.
For seeds discovery you can use a headless service, similar to this Multi-node Cassandra Cluster Made Easy with Kubernetes.
Some difficulties:
fast local storage for K8s is still in beta; of course, you can use what K8s already has; some users report running Ceph RBD with 8 C* nodes, each holding 2TB of data, on K8s.
at some point you will realize that you need a C* operator - here are some good starting points: Instaclustr's Cassandra Operator and Pantheon Systems' Cassandra Operator
you need a way to scale stateful applications in gracefully (this should also be covered by the operator; here is a solution if you don't want an operator, but you still need to use a controller).
You could also check the Cassandra mailing list, since there are people there already using Cassandra over K8s in production.
I have implemented Cassandra on Kubernetes. Please find my Service and StatefulSet YAML files below:
apiVersion: v1
kind: Service
metadata:
  labels:
    app: cassandra
  name: cassandra
spec:
  clusterIP: None
  ports:
  - port: 9042
  selector:
    app: cassandra
---
apiVersion: apps/v1beta2
kind: StatefulSet
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  serviceName: cassandra
  replicas: 3
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      terminationGracePeriodSeconds: 1800
      containers:
      - name: cassandra
        image: gcr.io/google-samples/cassandra:v12
        imagePullPolicy: Always
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042
          name: cql
        resources:
          limits:
            cpu: "500m"
            memory: 1Gi
          requests:
            cpu: "500m"
            memory: 1Gi
        securityContext:
          capabilities:
            add:
            - IPC_LOCK
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/sh
              - -c
              - nodetool drain
        env:
        - name: MAX_HEAP_SIZE
          value: 512M
        - name: HEAP_NEWSIZE
          value: 100M
        - name: CASSANDRA_SEEDS
          value: "cassandra-0.cassandra.default.svc.cluster.local"
        - name: CASSANDRA_CLUSTER_NAME
          value: "K8Demo"
        - name: CASSANDRA_DC
          value: "DC1-K8Demo"
        - name: CASSANDRA_RACK
          value: "Rack1-K8Demo"
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        readinessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - /ready-probe.sh
          initialDelaySeconds: 15
          timeoutSeconds: 5
        volumeMounts:
        - name: cassandra-data
          mountPath: /cassandra_data
  volumeClaimTemplates:
  - metadata:
      name: cassandra-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "fast"
      resources:
        requests:
          storage: 5Gi
Hope this helps.
Use Helm:
On Mac:
brew install helm@2
brew link --force helm@2
helm init
To Avoid Kubernetes Helm permission Hell:
from: https://github.com/helm/helm/issues/2224:
kubectl create serviceaccount --namespace kube-system tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
Cassandra
incubator:
helm repo add incubator https://kubernetes-charts-incubator.storage.googleapis.com/
helm install --namespace "cassandra" -n "cassandra" incubator/cassandra
helm status "cassandra"
helm delete --purge "cassandra"
bitnami:
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install --namespace "cassandra" -n "my-deployment" bitnami/cassandra
helm status "my-deployment"
helm delete --purge "my-deployment"

How to mount a volume with a windows container in kubernetes?

I'm trying to mount a persistent volume into my Windows container, but I always get this error:
Unable to mount volumes for pod "mssql-with-pv-deployment-3263067711-xw3mx_default(....)": timeout expired waiting for volumes to attach/mount for pod "default"/"mssql-with-pv-deployment-3263067711-xw3mx". list of unattached/unmounted volumes=[blobdisk01]
I've created a GitHub gist with the console output of "get events" and "describe sc | pvc | po"; maybe someone will find the solution with it.
Below are my scripts that I'm using for deployment.
my storageclass:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azure-disk-sc
provisioner: kubernetes.io/azure-disk
parameters:
  skuname: Standard_LRS
my PersistentVolumeClaim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azure-disk-pvc
spec:
  storageClassName: azure-disk-sc
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
and the deployment of my container:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: mssql-with-pv-deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: mssql-with-pv
    spec:
      nodeSelector:
        beta.kubernetes.io/os: windows
      terminationGracePeriodSeconds: 10
      containers:
      - name: mssql-with-pv
        image: testacr.azurecr.io/sql/mssql-server-windows-developer
        ports:
        - containerPort: 1433
        env:
        - name: ACCEPT_EULA
          value: "Y"
        - name: SA_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mssql
              key: SA_PASSWORD
        volumeMounts:
        - mountPath: "c:/volume"
          name: blobdisk01
      volumes:
      - name: blobdisk01
        persistentVolumeClaim:
          claimName: azure-disk-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: mssql-with-pv-deployment
spec:
  selector:
    app: mssql-with-pv
  ports:
  - protocol: TCP
    port: 1433
    targetPort: 1433
  type: LoadBalancer
What am I doing wrong? Is there another way to mount a volume?
Thanks for any help :)
I would try:
Change API version to v1: https://kubernetes.io/docs/concepts/storage/storage-classes/#azure-disk
kubectl get events to see if you have a more detailed error (I could figure out the reason when I used NFS by watching the events)
maybe it is this bug I read about in this post?
You will need a new volume on the D: drive; it looks like folders on C: are not supported for Windows containers, see here:
https://github.com/kubernetes/kubernetes/issues/65060
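So, as a hedged sketch, the mount from the deployment above would become something like:
volumeMounts:
- mountPath: "d:"
  name: blobdisk01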
Demos:
https://github.com/andyzhangx/demo/tree/master/windows/azuredisk
