Elasticsearch PVC - azure

I am trying to build a es cluster using the helm chart with the following es yaml: values:
resources:
requests:
cpu: ".1"
memory: "2Gi"
limits:
cpu: "1"
memory: "3.5Gi"
volumeClaimTemplate:
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 500Gi
esConfig:
elasticsearch.yml: |
path.data: /mnt/azure
The problem is that the pods are throwing the following error at the start
"Caused by: java.nio.file.AccessDeniedException: /mnt/azure"
I put the azure disk as default storage in order not to specify the storage class. I don't know if this is the best practice or should i create the storage and after that mount it to the pods

You need to keep init container to change mounted directory ownership
You can update your path as per need, for you changes will be for /mnt/azure
initContainers:
- command:
- sh
- -c
- chown -R 1000:1000 /usr/share/elasticsearch/data
- sysctl -w vm.max_map_count=262144
- chmod 777 /usr/share/elasticsearch/data
- chomod 777 /usr/share/elasticsearch/data/node
- chmod g+rwx /usr/share/elasticsearch/data
- chgrp 1000 /usr/share/elasticsearch/data
Example stateful sets file
apiVersion: apps/v1
kind: StatefulSet
metadata:
labels:
app : elasticsearch
component: elasticsearch
release: elasticsearch
name: elasticsearch
spec:
podManagementPolicy: Parallel
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app : elasticsearch
component: elasticsearch
release: elasticsearch
serviceName: elasticsearch
template:
metadata:
creationTimestamp: null
labels:
app : elasticsearch
component: elasticsearch
release: elasticsearch
spec:
containers:
- env:
- name: cluster.name
value: <SET THIS>
- name: discovery.type
value: single-node
- name: ES_JAVA_OPTS
value: -Xms512m -Xmx512m
- name: bootstrap.memory_lock
value: "false"
image: elasticsearch:6.5.0
imagePullPolicy: IfNotPresent
name: elasticsearch
ports:
- containerPort: 9200
name: http
protocol: TCP
- containerPort: 9300
name: transport
protocol: TCP
resources:
limits:
cpu: 250m
memory: 1Gi
requests:
cpu: 150m
memory: 512Mi
securityContext:
privileged: true
runAsUser: 1000
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /usr/share/elasticsearch/data
name: elasticsearch-data
dnsPolicy: ClusterFirst
initContainers:
- command:
- sh
- -c
- chown -R 1000:1000 /usr/share/elasticsearch/data
- sysctl -w vm.max_map_count=262144
- chmod 777 /usr/share/elasticsearch/data
- chomod 777 /usr/share/elasticsearch/data/node
- chmod g+rwx /usr/share/elasticsearch/data
- chgrp 1000 /usr/share/elasticsearch/data
image: busybox:1.29.2
imagePullPolicy: IfNotPresent
name: set-dir-owner
resources: {}
securityContext:
privileged: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /usr/share/elasticsearch/data
name: elasticsearch-data
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 10
updateStrategy:
type: OnDelete
volumeClaimTemplates:
- metadata:
creationTimestamp: null
name: elasticsearch-data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi

The mounted Elasticsearch data directory by default is owned by root. Try the following container to change it before Elasticsearch starts:
initContainers:
- name: chown
image: busybox
imagePullPolicy: IfNotPresent
command:
- chown
args:
- 1000:1000
- /mnt/azure
volumeMounts:
- name: <your volume claim template name>
mountPath: /mnt/azure

Related

Apache Flink Operator - enable azure-fs-hadoop

I am trying to perform a flink job, using Flink Operator (https://github:com/apache/flink-kubernetes-operator) on k8s, that using uses a connection to Azure Blob Storage described here: https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/filesystems/azure/
Following the guideline I need to copy the jar file flink-azure-fs-hadoop-1.15.0.jar from one directory to another.
I have already tried to do it via podTemplate and command functionality, but unfortunately it does not work, and the file does not appear in the destination directory.
Can you guide me on how to do it properly?
Below you can find my FlinkDeployment file.
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
namespace: flink
name: basic-example
spec:
image: flink:1.15
flinkVersion: v1_15
flinkConfiguration:
taskmanager.numberOfTaskSlots: "2"
serviceAccount: flink
podTemplate:
apiVersion: v1
kind: Pod
metadata:
name: pod-template
spec:
serviceAccount: flink
containers:
- name: flink-main-container
volumeMounts:
- mountPath: /opt/flink/data
name: flink-data
# command:
# - "touch"
# - "/tmp/test.txt"
volumes:
- name: flink-data
emptyDir: { }
jobManager:
resource:
memory: "2048m"
cpu: 1
podTemplate:
apiVersion: v1
kind: Pod
metadata:
name: job-manager-pod-template
spec:
initContainers:
- name: fetch-jar
image: cirrusci/wget
volumeMounts:
- mountPath: /opt/flink/data
name: flink-data
command:
- "wget"
- "LINK_TO_CUSTOM_JAR_FILE_ON_AZURE_BLOB_STORAGE"
- "-O"
- "/opt/flink/data/test.jar"
containers:
- name: flink-main-container
command:
- "touch"
- "/tmp/test.txt"
taskManager:
resource:
memory: "2048m"
cpu: 1
job:
jarURI: local:///opt/flink/data/test.jar
parallelism: 2
upgradeMode: stateless
state: running
ingress:
template: "CUSTOM_LINK_TO_AZURE"
annotations:
cert-manager.io/cluster-issuer: letsencrypt
kubernetes.io/ingress.allow-http: 'false'
traefik.ingress.kubernetes.io/router.entrypoints: websecure
traefik.ingress.kubernetes.io/router.tls: 'true'
traefik.ingress.kubernetes.io/router.tls.options: default
Since you are using the stock Flink 1.15 image this Azure filesystem plugin comes built-in. You can enable it via setting the ENABLE_BUILT_IN_PLUGINS environment variable.
spec:
podTemplate:
containers:
# Do not change the main container name
- name: flink-main-container
env:
- name: ENABLE_BUILT_IN_PLUGINS
value: flink-azure-fs-hadoop-1.15.0.jar
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/standalone/docker/#using-filesystem-plugins

cAdvisor : Could not configure a source for OOM detection

I have deployed cAdvisor DaemonSet on Kubernetes (EKS) with following manifest
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: cadvisor
namespace: kube-monitoring
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: cadvisor
rules:
- apiGroups: ['policy']
resources: ['podsecuritypolicies']
verbs: ['use']
resourceNames:
- cadvisor
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: cadvisor
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cadvisor
subjects:
- kind: ServiceAccount
name: cadvisor
namespace: kube-monitoring
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: cadvisor
namespace: kube-monitoring
spec:
selector:
matchLabels:
name: cadvisor
template:
metadata:
labels:
name: cadvisor
spec:
serviceAccountName: cadvisor
containers:
- name: cadvisor
image: google/cadvisor:latest
resources:
requests:
memory: 400Mi
cpu: 400m
limits:
memory: 2000Mi
cpu: 800m
ports:
- name: http
containerPort: 8080
protocol: TCP
volumeMounts:
- name: rootfs
mountPath: /rootfs
readOnly: true
- name: var-run
mountPath: /var/run
readOnly: true
- name: sys
mountPath: /sys
readOnly: true
- name: docker
mountPath: /var/lib/docker
readOnly: true
- name: disk
mountPath: /dev/disk
readOnly: true
automountServiceAccountToken: false
terminationGracePeriodSeconds: 30
volumes:
- name: rootfs
hostPath:
path: /
- name: var-run
hostPath:
path: /var/run
- name: sys
hostPath:
path: /sys
- name: docker
hostPath:
path: /var/lib/docker
- name: disk
hostPath:
path: /dev/disk
---
But in the cAdvisor container logs I see following messages
W0608 16:00:47.238042 1 manager.go:349] Could not configure a source for OOM detection, disabling OOM events: open /dev/kmsg: no such file or directory
I can connect to cAdvisor UI at http://localhost:8080/containers/ without any issue.
What is wrong in this cAdvisor setup ?
I solved the same issue with
privileged: true

Azure Key Vault integration with AKS works for nginx tutorial Pod, but not actual project deployment

Per the title, I have the integration working following the documentation.
I can deploy the nginx.yaml and after about 70 seconds I can print out secrets with:
kubectl exec -it nginx -- cat /mnt/secrets-store/secret1
Now I'm trying to apply it to a PostgreSQL deployment for testing and I get the following from the Pod description:
Warning FailedMount 3s kubelet MountVolume.SetUp failed for volume "secrets-store01-inline" : rpc error: code = Unknown desc = failed to mount secrets store objects for pod staging/postgres-deployment-staging-69965ff767-8hmww, err: rpc error: code = Unknown desc = failed to mount objects, error: failed to get keyvault client: failed to get key vault token: nmi response failed with status code: 404, err: <nil>
And from the nmi logs:
E0221 22:54:32.037357 1 server.go:234] failed to get identities, error: getting assigned identities for pod staging/postgres-deployment-staging-69965ff767-8hmww in CREATED state failed after 16 attempts, retry duration [5]s, error: <nil>. Check MIC pod logs for identity assignment errors
I0221 22:54:32.037409 1 server.go:192] status (404) took 80003389208 ns for req.method=GET reg.path=/host/token/ req.remote=127.0.0.1
Not sure why since I basically copied the settings from the nignx.yaml into the postgres.yaml. Here they are:
# nginx.yaml
kind: Pod
apiVersion: v1
metadata:
name: nginx
namespace: staging
labels:
aadpodidbinding: aks-akv-identity-binding-selector
spec:
containers:
- name: nginx
image: nginx
volumeMounts:
- name: secrets-store01-inline
mountPath: /mnt/secrets-store
readOnly: true
volumes:
- name: secrets-store01-inline
csi:
driver: secrets-store.csi.k8s.io
readOnly: true
volumeAttributes:
secretProviderClass: aks-akv-secret-provider
# postgres.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: postgres-deployment-staging
namespace: staging
labels:
aadpodidbinding: aks-akv-identity-binding-selector
spec:
replicas: 1
selector:
matchLabels:
component: postgres
template:
metadata:
labels:
component: postgres
spec:
containers:
- name: postgres
image: postgres:13-alpine
ports:
- containerPort: 5432
volumeMounts:
- name: secrets-store01-inline
mountPath: /mnt/secrets-store
readOnly: true
- name: postgres-storage-staging
mountPath: /var/postgresql
volumes:
- name: secrets-store01-inline
csi:
driver: secrets-store.csi.k8s.io
readOnly: true
volumeAttributes:
secretProviderClass: aks-akv-secret-provider
- name: postgres-storage-staging
persistentVolumeClaim:
claimName: postgres-storage-staging
---
apiVersion: v1
kind: Service
metadata:
name: postgres-cluster-ip-service-staging
namespace: staging
spec:
type: ClusterIP
selector:
component: postgres
ports:
- port: 5432
targetPort: 5432
Suggestions for what the issue is here?
Oversight on my part... the aadpodidbinding should be in the template: per:
https://azure.github.io/aad-pod-identity/docs/best-practices/#deploymenthttpskubernetesiodocsconceptsworkloadscontrollersdeployment
The resulting YAML should be:
# postgres.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: postgres-deployment-production
namespace: production
spec:
replicas: 1
selector:
matchLabels:
component: postgres
template:
metadata:
labels:
component: postgres
aadpodidbinding: aks-akv-identity-binding-selector
spec:
containers:
- name: postgres
image: postgres:13-alpine
ports:
- containerPort: 5432
env:
- name: POSTGRES_DB_FILE
value: /mnt/secrets-store/DEV-PGDATABASE
- name: POSTGRES_USER_FILE
value: /mnt/secrets-store/DEV-PGUSER
- name: POSTGRES_PASSWORD_FILE
value: /mnt/secrets-store/DEV-PGPASSWORD
- name: POSTGRES_INITDB_ARGS
value: "-A md5"
- name: PGDATA
value: /var/postgresql/data
volumeMounts:
- name: secrets-store01-inline
mountPath: /mnt/secrets-store
readOnly: true
- name: postgres-storage-production
mountPath: /var/postgresql
volumes:
- name: secrets-store01-inline
csi:
driver: secrets-store.csi.k8s.io
readOnly: true
volumeAttributes:
secretProviderClass: aks-akv-secret-provider
- name: postgres-storage-production
persistentVolumeClaim:
claimName: postgres-storage-production
---
apiVersion: v1
kind: Service
metadata:
name: postgres-cluster-ip-service-production
namespace: production
spec:
type: ClusterIP
selector:
component: postgres
ports:
- port: 5432
targetPort: 5432
Adding template in spec will resolve the issue, use label "aadpodidbinding: "your azure pod identity selector" in the template labels section in deployment.yaml file
sample deployment file
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
labels:
app: nginx
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
aadpodidbinding: azure-pod-identity-binding-selector
spec:
containers:
- name: nginx
image: nginx
env:
- name: SECRET
valueFrom:
secretKeyRef:
name: test-secret
key: key
volumeMounts:
- name: secrets-store-inline
mountPath: "/mnt/secrets-store"
readOnly: true
volumes:
- name: secrets-store-inline
csi:
driver: secrets-store.csi.k8s.io
readOnly: true
volumeAttributes:
secretProviderClass: dev-1spc

How to inject evnironment variables to driver pod when using spark-on-k8s?

I am writing a Kubernetes Spark Application using GCP spark on k8s.
Currently, I am stuck at not being able to inject environment variables into my container.
I am following the doc here
Manifest:
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
name: spark-search-indexer
namespace: spark-operator
spec:
type: Scala
mode: cluster
image: "gcr.io/spark-operator/spark:v2.4.5"
imagePullPolicy: Always
mainClass: com.quid.indexer.news.jobs.ESIndexingJob
mainApplicationFile: "https://lala.com/baba-0.0.43.jar"
arguments:
- "--esSink"
- "http://something:9200/mo-sn-{yyyy-MM}-v0.0.43/searchable-article"
- "-streaming"
- "--kafkaTopics"
- "annotated_blogs,annotated_ln_news,annotated_news"
- "--kafkaBrokers"
- "10.1.1.1:9092"
sparkVersion: "2.4.5"
restartPolicy:
type: Never
volumes:
- name: "test-volume"
hostPath:
path: "/tmp"
type: Directory
driver:
cores: 1
coreLimit: "1200m"
memory: "512m"
env:
- name: "DEMOGRAPHICS_ES_URI"
value: "somevalue"
labels:
version: 2.4.5
volumeMounts:
- name: "test-volume"
mountPath: "/tmp"
executor:
cores: 1
instances: 1
memory: "512m"
env:
- name: "DEMOGRAPHICS_ES_URI"
value: "somevalue"
labels:
version: 2.4.5
volumeMounts:
- name: "test-volume"
mountPath: "/tmp"
Environment Variables set at pod:
Environment:
SPARK_DRIVER_BIND_ADDRESS: (v1:status.podIP)
SPARK_LOCAL_DIRS: /var/data/spark-1ed8539d-b157-4fab-9aa6-daff5789bfb5
SPARK_CONF_DIR: /opt/spark/conf
It turns out to use this one must enable webhooks (how to set up in quick-start guide here)
The other approach could be to use envVars
Example:
spec:
executor:
envVars:
DEMOGRAPHICS_ES_URI: "somevalue"
Ref: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/978

write access error for mounted volume on kubernetes

When we were deploying active-mq in azure kubernetes service(aks), where active-mq data folder mounted on azure managed disk as a persistent volume claim. Below is the yaml used for deployment.
ActiveMQ Image used: rmohr/activemq
Kubernetes Version: v1.15.7
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: activemqcontainer
spec:
replicas: 1
selector:
matchLabels:
app: activemqcontainer
template:
metadata:
labels:
app: activemqcontainer
spec:
securityContext:
runAsUser: 1000
fsGroup: 2000
runAsNonRoot: false
containers:
- name: web
image: azureregistry.azurecr.io/rmohractivemq
imagePullPolicy: IfNotPresent
ports:
- containerPort: 61616
volumeMounts:
- mountPath: /opt/activemq/data
subPath: data
name: volume
- mountPath: /opt/apache-activemq-5.15.6/conf/activemq.xml
name: config-xml
subPath: activemq.xml
imagePullSecrets:
- name: secret
volumes:
- name: config-xml
configMap:
name: active-mq-xml
- name: volume
persistentVolumeClaim:
claimName: azure-managed-disk
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: azure-managed-disk
spec:
accessModes:
- ReadWriteOnce
storageClassName: managed-premium
resources:
requests:
storage: 100Gi
Getting below error.
WARN | Failed startup of context o.e.j.w.WebAppContext#517566b{/admin,file:/opt/apache-activemq-5.15.6/webapps/admin/,null}
java.lang.IllegalStateException: Parent for temp dir not configured correctly: writeable=false
at org.eclipse.jetty.webapp.WebInfConfiguration.makeTempDirectory(WebInfConfiguration.java:336)[jetty-all-9.2.25.v20180606.jar:9.2.25.v20180606]
at org.eclipse.jetty.webapp.WebInfConfiguration.resolveTempDirectory(WebInfConfiguration.java:304)[jetty-all-9.2.25.v20180606.jar:9.2.25.v20180606]
at org.eclipse.jetty.webapp.WebInfConfiguration.preConfigure(WebInfConfiguration.java:69)[jetty-all-9.2.25.v20180606.jar:9.2.25.v20180606]
at org.eclipse.jetty.webapp.WebAppContext.preConfigure(WebAppContext.java:468)[jetty-all-9.2.25.v20180606.jar:9.2.25.v20180606]
at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:504)[jetty-all-9.2.25.v20180606.jar:9.2.25.v20180606]
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)[jetty-all-9.2.25.v20180606.jar:9.2.25.v20180606]
at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:132)[jetty-all-9.2.25.v20180606.jar:9.2.25.v20180606]
at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:114)[jetty-all-9.2.25.v20180606.jar:9.2.25.v20180606]
at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:61)[jetty-all-9.2.25.v20180606.jar:9.2.2
Its a warning from activemq web admin console. Jetty which hosts web console is unable to create temp directory.
WARN | Failed startup of context o.e.j.w.WebAppContext#517566b{/admin,file:/opt/apache-activemq-5.15.6/webapps/admin/,null}
java.lang.IllegalStateException: Parent for temp dir not configured correctly: writeable=false
You can override default temp directory by setting up environment variable ACTIVEMQ_TMP as below in container spec
env:
- name: ACTIVEMQ_TMP
value : "/tmp"

Resources