How to inject environment variables into the driver pod when using spark-on-k8s? - apache-spark

I am writing a Kubernetes Spark Application using GCP spark on k8s.
Currently, I am stuck at not being able to inject environment variables into my container.
I am following the doc here
Manifest:
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: spark-search-indexer
  namespace: spark-operator
spec:
  type: Scala
  mode: cluster
  image: "gcr.io/spark-operator/spark:v2.4.5"
  imagePullPolicy: Always
  mainClass: com.quid.indexer.news.jobs.ESIndexingJob
  mainApplicationFile: "https://lala.com/baba-0.0.43.jar"
  arguments:
    - "--esSink"
    - "http://something:9200/mo-sn-{yyyy-MM}-v0.0.43/searchable-article"
    - "-streaming"
    - "--kafkaTopics"
    - "annotated_blogs,annotated_ln_news,annotated_news"
    - "--kafkaBrokers"
    - "10.1.1.1:9092"
  sparkVersion: "2.4.5"
  restartPolicy:
    type: Never
  volumes:
    - name: "test-volume"
      hostPath:
        path: "/tmp"
        type: Directory
  driver:
    cores: 1
    coreLimit: "1200m"
    memory: "512m"
    env:
      - name: "DEMOGRAPHICS_ES_URI"
        value: "somevalue"
    labels:
      version: 2.4.5
    volumeMounts:
      - name: "test-volume"
        mountPath: "/tmp"
  executor:
    cores: 1
    instances: 1
    memory: "512m"
    env:
      - name: "DEMOGRAPHICS_ES_URI"
        value: "somevalue"
    labels:
      version: 2.4.5
    volumeMounts:
      - name: "test-volume"
        mountPath: "/tmp"
Environment Variables set at pod:
Environment:
  SPARK_DRIVER_BIND_ADDRESS: (v1:status.podIP)
  SPARK_LOCAL_DIRS: /var/data/spark-1ed8539d-b157-4fab-9aa6-daff5789bfb5
  SPARK_CONF_DIR: /opt/spark/conf

It turns out that env only takes effect when the operator's webhook is enabled (the quick-start guide explains how to set it up).
Another approach is to use envVars instead.
Example:
spec:
  executor:
    envVars:
      DEMOGRAPHICS_ES_URI: "somevalue"
Ref: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/978
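Since the question is about the driver pod, the same envVars map can presumably be set under driver as well. A minimal sketch, assuming the operator version in use still accepts the envVars field on both driver and executor (the variable name and value are the ones from the question's manifest):

```yaml
spec:
  driver:
    # envVars is a plain name-to-value map; per the answer above it is the
    # alternative to env, which needs the webhook.
    envVars:
      DEMOGRAPHICS_ES_URI: "somevalue"
  executor:
    envVars:
      DEMOGRAPHICS_ES_URI: "somevalue"
```

After resubmitting the application, describing the driver pod with kubectl should list DEMOGRAPHICS_ES_URI under Environment alongside the SPARK_* variables shown in the question.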

Related

Apache Flink Operator - enable azure-fs-hadoop

I am trying to run a Flink job on k8s with the Flink Operator (https://github.com/apache/flink-kubernetes-operator). The job uses a connection to Azure Blob Storage, as described here: https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/filesystems/azure/
Following the guideline I need to copy the jar file flink-azure-fs-hadoop-1.15.0.jar from one directory to another.
I have already tried to do it via podTemplate and command functionality, but unfortunately it does not work, and the file does not appear in the destination directory.
Can you guide me on how to do it properly?
Below you can find my FlinkDeployment file.
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  namespace: flink
  name: basic-example
spec:
  image: flink:1.15
  flinkVersion: v1_15
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: "2"
  serviceAccount: flink
  podTemplate:
    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-template
    spec:
      serviceAccount: flink
      containers:
        - name: flink-main-container
          volumeMounts:
            - mountPath: /opt/flink/data
              name: flink-data
          # command:
          #   - "touch"
          #   - "/tmp/test.txt"
      volumes:
        - name: flink-data
          emptyDir: { }
  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
    podTemplate:
      apiVersion: v1
      kind: Pod
      metadata:
        name: job-manager-pod-template
      spec:
        initContainers:
          - name: fetch-jar
            image: cirrusci/wget
            volumeMounts:
              - mountPath: /opt/flink/data
                name: flink-data
            command:
              - "wget"
              - "LINK_TO_CUSTOM_JAR_FILE_ON_AZURE_BLOB_STORAGE"
              - "-O"
              - "/opt/flink/data/test.jar"
        containers:
          - name: flink-main-container
            command:
              - "touch"
              - "/tmp/test.txt"
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
  job:
    jarURI: local:///opt/flink/data/test.jar
    parallelism: 2
    upgradeMode: stateless
    state: running
  ingress:
    template: "CUSTOM_LINK_TO_AZURE"
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt
      kubernetes.io/ingress.allow-http: 'false'
      traefik.ingress.kubernetes.io/router.entrypoints: websecure
      traefik.ingress.kubernetes.io/router.tls: 'true'
      traefik.ingress.kubernetes.io/router.tls.options: default
Since you are using the stock Flink 1.15 image, the Azure filesystem plugin is already built in. You can enable it by setting the ENABLE_BUILT_IN_PLUGINS environment variable on the main container:
spec:
  podTemplate:
    spec:
      containers:
        # Do not change the main container name
        - name: flink-main-container
          env:
            - name: ENABLE_BUILT_IN_PLUGINS
              value: flink-azure-fs-hadoop-1.15.0.jar
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/standalone/docker/#using-filesystem-plugins

How to send args SparkKubernetesOperator Airflow

I have a DAG in Airflow running on Kubernetes with Spark.
How can I send AWS credentials to a Spark file using the SparkKubernetesOperator?
In my DAG file I get the credentials from the connections:
Example:
from airflow.hooks.base import BaseHook
aws_conn = BaseHook.get_connection('aws_conn')
How is it possible to send this aws_conn to the spark file through the operator?
transformation = SparkKubernetesOperator(
    task_id='spark_transform_frete_new',
    namespace='airflow',
    application_file='spark/spark_transform_frete_new.yaml',
    kubernetes_conn_id='kubernetes_default',
    do_xcom_push=True,
)
The yaml file:
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: "dag-example-spark-{{ macros.datetime.now().strftime("%Y-%m-%d-%H-%M-%S") }}-{{ task_instance.try_number }}"
  namespace: airflow
spec:
  timeToLiveSeconds: 30
  volumes:
    - name: ivy
      persistentVolumeClaim:
        claimName: dags-volume-pvc
    - name: logs
      persistentVolumeClaim:
        claimName: logs-volume-pvc
  sparkConf:
    spark.jars.packages: "org.apache.hadoop:hadoop-aws:3.2.0,org.apache.spark:spark-avro_2.12:3.0.1"
    spark.driver.extraJavaOptions: "-Divy.cache.dir=/tmp -Divy.home=/tmp"
    "spark.kubernetes.local.dirs.tmpfs": "true"
    "spark.eventLog.enabled": "true"
    "spark.eventLog.dir": "/logs/spark/"
  hadoopConf:
    fs.s3a.impl: org.apache.hadoop.fs.s3a.S3AFileSystem
  type: Python
  pythonVersion: "3"
  mode: cluster
  image: "myimagespark/spark-dev"
  imagePullPolicy: Always
  mainApplicationFile: local:///dags/dag_example_python_spark/src/spark/spark_transform_frete_new.py
  sparkVersion: "3.1.1"
  restartPolicy:
    type: Never
  driver:
    cores: 1
    coreLimit: "1200m"
    memory: "4g"
    labels:
      version: 3.1.1
    serviceAccount: spark
    volumeMounts:
      - name: ivy
        mountPath: /dags
      - name: logs
        mountPath: /logs/spark/
  executor:
    cores: 2
    instances: 2
    memory: "3g"
    labels:
      version: 3.1.1
    volumeMounts:
      - name: ivy
        mountPath: /dags
      - name: logs
        mountPath: /logs/spark/
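One possible approach, offered as a sketch rather than a confirmed answer: the application file is already rendered as a Jinja template (metadata.name above uses macros), so connection fields could be referenced from inside the YAML itself. This assumes Airflow 2.2 or later, where the conn object is available in the template context, and it assumes that the aws_conn connection stores the access key ID in login and the secret key in password. The rendered values end up in plain text in the SparkApplication object, so a Kubernetes Secret may be preferable for real credentials.

```yaml
spec:
  hadoopConf:
    fs.s3a.impl: org.apache.hadoop.fs.s3a.S3AFileSystem
    # Assumption: Airflow >= 2.2 exposes connections to templates as conn.<conn_id>
    # Assumption: login holds the access key ID, password holds the secret key
    fs.s3a.access.key: "{{ conn.aws_conn.login }}"
    fs.s3a.secret.key: "{{ conn.aws_conn.password }}"
```

With this in place the DAG code would not need to call BaseHook.get_connection at all, because the lookup happens when the operator renders the application file.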

Azure Key Vault integration with AKS works for nginx tutorial Pod, but not actual project deployment

Per the title, I have the integration working following the documentation.
I can deploy the nginx.yaml and after about 70 seconds I can print out secrets with:
kubectl exec -it nginx -- cat /mnt/secrets-store/secret1
Now I'm trying to apply it to a PostgreSQL deployment for testing and I get the following from the Pod description:
Warning FailedMount 3s kubelet MountVolume.SetUp failed for volume "secrets-store01-inline" : rpc error: code = Unknown desc = failed to mount secrets store objects for pod staging/postgres-deployment-staging-69965ff767-8hmww, err: rpc error: code = Unknown desc = failed to mount objects, error: failed to get keyvault client: failed to get key vault token: nmi response failed with status code: 404, err: <nil>
And from the nmi logs:
E0221 22:54:32.037357 1 server.go:234] failed to get identities, error: getting assigned identities for pod staging/postgres-deployment-staging-69965ff767-8hmww in CREATED state failed after 16 attempts, retry duration [5]s, error: <nil>. Check MIC pod logs for identity assignment errors
I0221 22:54:32.037409 1 server.go:192] status (404) took 80003389208 ns for req.method=GET reg.path=/host/token/ req.remote=127.0.0.1
I am not sure why, since I basically copied the settings from nginx.yaml into postgres.yaml. Here they are:
# nginx.yaml
kind: Pod
apiVersion: v1
metadata:
  name: nginx
  namespace: staging
  labels:
    aadpodidbinding: aks-akv-identity-binding-selector
spec:
  containers:
    - name: nginx
      image: nginx
      volumeMounts:
        - name: secrets-store01-inline
          mountPath: /mnt/secrets-store
          readOnly: true
  volumes:
    - name: secrets-store01-inline
      csi:
        driver: secrets-store.csi.k8s.io
        readOnly: true
        volumeAttributes:
          secretProviderClass: aks-akv-secret-provider
# postgres.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres-deployment-staging
  namespace: staging
  labels:
    aadpodidbinding: aks-akv-identity-binding-selector
spec:
  replicas: 1
  selector:
    matchLabels:
      component: postgres
  template:
    metadata:
      labels:
        component: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:13-alpine
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: secrets-store01-inline
              mountPath: /mnt/secrets-store
              readOnly: true
            - name: postgres-storage-staging
              mountPath: /var/postgresql
      volumes:
        - name: secrets-store01-inline
          csi:
            driver: secrets-store.csi.k8s.io
            readOnly: true
            volumeAttributes:
              secretProviderClass: aks-akv-secret-provider
        - name: postgres-storage-staging
          persistentVolumeClaim:
            claimName: postgres-storage-staging
---
apiVersion: v1
kind: Service
metadata:
  name: postgres-cluster-ip-service-staging
  namespace: staging
spec:
  type: ClusterIP
  selector:
    component: postgres
  ports:
    - port: 5432
      targetPort: 5432
Any suggestions as to what the issue is here?
Oversight on my part: the aadpodidbinding label has to go on the pod template (spec.template.metadata.labels), not on the Deployment's own metadata, per:
https://azure.github.io/aad-pod-identity/docs/best-practices/#deploymenthttpskubernetesiodocsconceptsworkloadscontrollersdeployment
The resulting YAML should be:
# postgres.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres-deployment-production
  namespace: production
spec:
  replicas: 1
  selector:
    matchLabels:
      component: postgres
  template:
    metadata:
      labels:
        component: postgres
        aadpodidbinding: aks-akv-identity-binding-selector
    spec:
      containers:
        - name: postgres
          image: postgres:13-alpine
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_DB_FILE
              value: /mnt/secrets-store/DEV-PGDATABASE
            - name: POSTGRES_USER_FILE
              value: /mnt/secrets-store/DEV-PGUSER
            - name: POSTGRES_PASSWORD_FILE
              value: /mnt/secrets-store/DEV-PGPASSWORD
            - name: POSTGRES_INITDB_ARGS
              value: "-A md5"
            - name: PGDATA
              value: /var/postgresql/data
          volumeMounts:
            - name: secrets-store01-inline
              mountPath: /mnt/secrets-store
              readOnly: true
            - name: postgres-storage-production
              mountPath: /var/postgresql
      volumes:
        - name: secrets-store01-inline
          csi:
            driver: secrets-store.csi.k8s.io
            readOnly: true
            volumeAttributes:
              secretProviderClass: aks-akv-secret-provider
        - name: postgres-storage-production
          persistentVolumeClaim:
            claimName: postgres-storage-production
---
apiVersion: v1
kind: Service
metadata:
  name: postgres-cluster-ip-service-production
  namespace: production
spec:
  type: ClusterIP
  selector:
    component: postgres
  ports:
    - port: 5432
      targetPort: 5432
Putting the label on the pod template will resolve the issue: add aadpodidbinding: "your azure pod identity selector" to the template's labels section in the deployment.yaml file.
Sample deployment file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
        aadpodidbinding: azure-pod-identity-binding-selector
    spec:
      containers:
        - name: nginx
          image: nginx
          env:
            - name: SECRET
              valueFrom:
                secretKeyRef:
                  name: test-secret
                  key: key
          volumeMounts:
            - name: secrets-store-inline
              mountPath: "/mnt/secrets-store"
              readOnly: true
      volumes:
        - name: secrets-store-inline
          csi:
            driver: secrets-store.csi.k8s.io
            readOnly: true
            volumeAttributes:
              secretProviderClass: dev-1spc

write access error for mounted volume on kubernetes

We are deploying ActiveMQ on Azure Kubernetes Service (AKS), with the ActiveMQ data folder mounted on an Azure managed disk through a persistent volume claim. Below is the YAML used for the deployment.
ActiveMQ Image used: rmohr/activemq
Kubernetes Version: v1.15.7
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: activemqcontainer
spec:
  replicas: 1
  selector:
    matchLabels:
      app: activemqcontainer
  template:
    metadata:
      labels:
        app: activemqcontainer
    spec:
      securityContext:
        runAsUser: 1000
        fsGroup: 2000
        runAsNonRoot: false
      containers:
        - name: web
          image: azureregistry.azurecr.io/rmohractivemq
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 61616
          volumeMounts:
            - mountPath: /opt/activemq/data
              subPath: data
              name: volume
            - mountPath: /opt/apache-activemq-5.15.6/conf/activemq.xml
              name: config-xml
              subPath: activemq.xml
      imagePullSecrets:
        - name: secret
      volumes:
        - name: config-xml
          configMap:
            name: active-mq-xml
        - name: volume
          persistentVolumeClaim:
            claimName: azure-managed-disk
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azure-managed-disk
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: managed-premium
  resources:
    requests:
      storage: 100Gi
We get the error below:
WARN | Failed startup of context o.e.j.w.WebAppContext@517566b{/admin,file:/opt/apache-activemq-5.15.6/webapps/admin/,null}
java.lang.IllegalStateException: Parent for temp dir not configured correctly: writeable=false
    at org.eclipse.jetty.webapp.WebInfConfiguration.makeTempDirectory(WebInfConfiguration.java:336)[jetty-all-9.2.25.v20180606.jar:9.2.25.v20180606]
    at org.eclipse.jetty.webapp.WebInfConfiguration.resolveTempDirectory(WebInfConfiguration.java:304)[jetty-all-9.2.25.v20180606.jar:9.2.25.v20180606]
    at org.eclipse.jetty.webapp.WebInfConfiguration.preConfigure(WebInfConfiguration.java:69)[jetty-all-9.2.25.v20180606.jar:9.2.25.v20180606]
    at org.eclipse.jetty.webapp.WebAppContext.preConfigure(WebAppContext.java:468)[jetty-all-9.2.25.v20180606.jar:9.2.25.v20180606]
    at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:504)[jetty-all-9.2.25.v20180606.jar:9.2.25.v20180606]
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)[jetty-all-9.2.25.v20180606.jar:9.2.25.v20180606]
    at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:132)[jetty-all-9.2.25.v20180606.jar:9.2.25.v20180606]
    at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:114)[jetty-all-9.2.25.v20180606.jar:9.2.25.v20180606]
    at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:61)[jetty-all-9.2.25.v20180606.jar:9.2.2
It's a warning from the ActiveMQ web admin console. Jetty, which hosts the web console, is unable to create its temp directory.
WARN | Failed startup of context o.e.j.w.WebAppContext@517566b{/admin,file:/opt/apache-activemq-5.15.6/webapps/admin/,null}
java.lang.IllegalStateException: Parent for temp dir not configured correctly: writeable=false
You can override the default temp directory by setting the environment variable ACTIVEMQ_TMP in the container spec, as below:
env:
  - name: ACTIVEMQ_TMP
    value: "/tmp"
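For placement, the env block sits on the container entry, next to image and volumeMounts. A sketch against the Deployment from the question, showing only the relevant fields; /tmp is the value suggested above and is assumed to be writable by the container user:

```yaml
spec:
  template:
    spec:
      containers:
        - name: web
          image: azureregistry.azurecr.io/rmohractivemq
          env:
            # Point ActiveMQ's temp directory at a writable location
            - name: ACTIVEMQ_TMP
              value: "/tmp"
          volumeMounts:
            - mountPath: /opt/activemq/data
              subPath: data
              name: volume
```

Using a path outside the mounted data volume keeps the temp directory independent of the permissions on the managed disk.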

How to mount cassandra data location to azure file share using stateful set kubernetes

I am setting up a 3-node Cassandra cluster on Azure using a Kubernetes StatefulSet, and I am not able to mount the data location on an Azure file share.
It works with the default Kubernetes storage, but not with the Azure file share option.
I have tried the steps given below, but I am having difficulty with volumeClaimTemplates:
apiVersion: "apps/v1"
kind: StatefulSet
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  serviceName: cassandra
  replicas: 3
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
        - name: cassandra
          image: cassandra
          imagePullPolicy: Always
          ports:
            - containerPort: 7000
              name: intra-node
            - containerPort: 7001
              name: tls-intra-node
            - containerPort: 7199
              name: jmx
            - containerPort: 9042
              name: cql
          env:
            - name: CASSANDRA_SEEDS
              value: cassandra-0.cassandra.default.svc.cluster.local
            - name: MAX_HEAP_SIZE
              value: 256M
            - name: HEAP_NEWSIZE
              value: 100M
            - name: CASSANDRA_CLUSTER_NAME
              value: "Cassandra"
            - name: CASSANDRA_DC
              value: "DC1"
            - name: CASSANDRA_RACK
              value: "Rack1"
            - name: CASSANDRA_ENDPOINT_SNITCH
              value: GossipingPropertyFileSnitch
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          volumeMounts:
            - mountPath: /var/lib/cassandra/data
              name: pv002
  volumeClaimTemplates:
    - metadata:
        name: pv002
      spec:
        storageClassName: default
        accessModes:
          - ReadWriteOnce
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: pv002
accessModes:
- ReadWriteOnce
azureFile:
secretName: storage-secret
shareName: xxxxx
readOnly: false
claimRef:
namespace: default
name: az-files-02
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: az-files-02
spec:
  accessModes:
    - ReadWriteOnce
---
apiVersion: v1
kind: Secret
metadata:
  name: storage-secret
type: Opaque
data:
  azurestorageaccountname: xxxxx
  azurestorageaccountkey: jjbfjbsfljbafkljasfkl;jf;kjd;kjklsfdhjbsfkjbfkjbdhueueeknekneiononeojnjnjHBDEJKBJBSDJBDJ==
I should be able to mount the data folder of each Cassandra node on an Azure file share.
For using Azure Files in a StatefulSet, I think you could follow this example: https://github.com/andyzhangx/demo/blob/master/linux/azurefile/attach-stress-test/statefulset-azurefile1-2files.yaml
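As a rough sketch of what such a claim template can look like: with volumeClaimTemplates, each replica gets its own PersistentVolumeClaim provisioned through a storage class, so the hand-written PersistentVolume and PersistentVolumeClaim from the question should not be needed. The storage class name azurefile, the claim name cassandra-data, and the 10Gi size are assumptions for illustration, not values from the linked example.

```yaml
# Fragment of the StatefulSet spec; other fields stay as in the question
spec:
  template:
    spec:
      containers:
        - name: cassandra
          volumeMounts:
            - name: cassandra-data        # must match the claim template name
              mountPath: /var/lib/cassandra/data
  volumeClaimTemplates:
    - metadata:
        name: cassandra-data             # hypothetical name
      spec:
        storageClassName: azurefile      # assumed storage class backed by Azure Files
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi                # assumed size; a storage request is required
```

The claim template in the question has no resources.requests.storage entry, which a PersistentVolumeClaim spec requires, so that may be part of the difficulty.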
