I am toying with the spark operator in kubernetes, and I am trying to create a Spark Application resource with the following manifest.
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: pyspark-pi
  namespace: spark-jobs
spec:
  batchScheduler: volcano
  batchSchedulerOptions:
    priorityClassName: routine
  type: Python
  pythonVersion: "3"
  mode: cluster
  image: "<image_name>"
  imagePullPolicy: Always
  mainApplicationFile: local:///spark-files/csv_data.py
  arguments:
    - "10"
  sparkVersion: "3.0.0"
  restartPolicy:
    type: OnFailure
    onFailureRetries: 3
    onFailureRetryInterval: 10
    onSubmissionFailureRetries: 5
    onSubmissionFailureRetryInterval: 20
  timeToLiveSeconds: 86400
  driver:
    cores: 1
    coreLimit: "1200m"
    memory: "512m"
    labels:
      version: 3.0.0
    serviceAccount: driver-sa
    volumeMounts:
      - name: sparky-data
        mountPath: /spark-data
  executor:
    cores: 1
    instances: 2
    memory: "512m"
    labels:
      version: 3.0.0
    volumeMounts:
      - name: sparky-data
        mountPath: /spark-data
  volumes:
    - name: sparky-data
      hostPath:
        path: /spark-data
I am running this in kind, where I have defined a mount from my local system into the kind nodes so that the data to be processed is present, and I can see the volume mounted on the kind nodes. But when I create the above resource, the driver pod crashes with a 'no such path' error. I printed the contents of the driver pod's root directory and could not see the mounted volume. What is the problem here, and how do I fix it?
The issue is related to permissions. When mounting a volume into a pod, you need to make sure the permissions are set correctly; specifically, the user or group running the application in the pod must have permission to access the data. You should also make sure that the path to the volume is valid and that the volume is properly mounted. To check whether a path exists, you can use the exec command:
kubectl exec <pod_name> -- ls
Try adding a security context, which defines privilege and access control settings for a pod.
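A minimal sketch for the manifest above, assuming your spark-operator version exposes securityContext on the driver and executor specs, and assuming the data under /spark-data is only readable by a specific user (both are assumptions, so adjust to your setup):

driver:
  securityContext:
    runAsUser: 0   # assumption: run as root to rule out permission problems; switch to the UID that owns /spark-data once it works
executor:
  securityContext:
    runAsUser: 0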
For more information, see the Kubernetes documentation on security contexts.
Related
Can anyone tell me how to configure the Kubernetes executor in a local Airflow deployment? I created a kind cluster named airflow-cluster, created the pod_template.yaml, and made the following changes in airflow.cfg.
[kubernetes]
# Path to the YAML pod file that forms the basis for KubernetesExecutor workers.
pod_template_file = /home/caxe/airflow/logs/yamls/pod_template.yaml
worker_container_repository = apache/airflow
worker_container_tag = 2.2.3
namespace = airflow
in_cluster = False
cluster_context = kind-airflow-cluster
config_file = /home/caxe/.kube/config
pod_template.yaml
---
apiVersion: v1
kind: Pod
metadata:
  name: dummy-name
spec:
  containers:
    - env:
        - name: AIRFLOW__CORE__EXECUTOR
          value: LocalExecutor
        # Hard Coded Airflow Envs
        - name: AIRFLOW__CORE__SQL_ALCHEMY_CONN
          value: postgresql+psycopg2://airflow:airflow#localhost:5432/airflow
      image: apache/airflow:2.2.3
      imagePullPolicy: IfNotPresent
      name: base
      volumeMounts:
        - mountPath: "/opt/airflow/logs"
          name: airflow-logs
        - mountPath: "/opt/airflow/dags"
          name: airflow-dags
          readOnly: true
        - mountPath: "/opt/airflow/airflow.cfg"
          name: airflow-config
          readOnly: true
          subPath: airflow.cfg
  restartPolicy: Never
  securityContext:
    runAsUser: 50000
    fsGroup: 50000
  serviceAccountName: airflow
  volumes:
    - name: airflow-logs
      persistentVolumeClaim:
        claimName: logs-pv-claim
    - name: airflow-dags
      persistentVolumeClaim:
        claimName: dag-pv-claim
    - configMap:
        name: k8s-config
      name: airflow-config
It doesn't execute. On running kubectl get pods -n airflow:
NAME READY STATUS RESTARTS AGE
examplebashoperatoralsorunthis.dd577351d4554c87923bc1eabe5e617e 0/1 Pending 0 114s
examplebashoperatorrunme0.afd364b8033643549a29ab536e9fc83f 0/1 Pending 0 116s
examplebashoperatorrunme1.47c97859639543bcab04a2ef0001ee9a 0/1 Pending 0 116s
examplebashoperatorrunme2.7296c3f011624f5ab62c1777187a006f 0/1 Pending 0 115s
examplebashoperatorthiswillskip.b9474f2673524a538ed2fddb6af00dd0 0/1 Pending 0 113s
I'm not a Kubernetes guy. I have created the persistent volume and claim for logs and DAGs, but I think there could be a problem with the non-cluster Postgres connection, because I have not configured Postgres in the cluster beyond providing values in the config and YAML file. Moreover, psycopg2 (apache-airflow[postgres]) is installed in my local Airflow, but since I haven't modified the base image apache/airflow:2.2.3, could it be missing there?
Setting up the Kubernetes executor required Postgres to be running as a service in the cluster, with its port forwarded so that the executor pods have access to it. It also required installing the additional package apache-airflow[postgres] by creating a Dockerfile on top of the Airflow base image. We also have to create persistent volumes and persistent volume claims where our DAGs and logs can be stored. Make sure the image we are using has the same Python version as the local system.
requirements.txt
apache-airflow-providers-cncf-kubernetes==3.0.1
apache-airflow-providers-postgres==2.4.0
Dockerfile
FROM apache/airflow:2.2.3-python3.8
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY test.py /opt/airflow/dags/
Create the custom image with a tag, e.g. airflow-custom:1.0.0, using docker build -t airflow-custom:1.0.0 . and specify it in the pod template file as well, as sketched below.
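A hedged sketch of the remaining wiring, assuming the kind cluster is named airflow-cluster and that the image tag matches the one above (both assumptions):

# build the custom image and make it available to the kind nodes
docker build -t airflow-custom:1.0.0 .
kind load docker-image airflow-custom:1.0.0 --name airflow-cluster

Then point the worker image settings at it, in airflow.cfg and in pod_template.yaml (image: airflow-custom:1.0.0):

worker_container_repository = airflow-custom
worker_container_tag = 1.0.0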
I deployed my first container and got this info:
deployment.apps/frontarena-ads-deployment created
but then I saw that my container creation is stuck in Waiting status.
Then I checked the events using kubectl describe pod frontarena-ads-deployment-5b475667dd-gzmlp and saw a MountVolume error which I cannot figure out:
Warning  FailedMount  9m24s  kubelet  MountVolume.SetUp failed for volume "ads-filesharevolume" : mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/85aa3bfa-341a-4da1-b3de-fb1979420028/volumes/kubernetes.io~azure-file/ads-filesharevolume --scope -- mount -t cifs -o username=frontarenastorage,password=mypassword,file_mode=0777,dir_mode=0777,vers=3.0 //frontarenastorage.file.core.windows.net/azurecontainershare /var/lib/kubelet/pods/85aa3bfa-341a-4da1-b3de-fb1979420028/volumes/kubernetes.io~azure-file/ads-filesharevolume
Output: Running scope as unit run-rf54d5b5f84854777956ae0e25810bb94.scope.
mount error(115): Operation now in progress
Refer to the mount.cifs(8) manual page (e.g. man mount.cifs)
Before running the deployment I created a secret (using the credentials of the already created Azure file share), which I reference within the YAML:
$AKS_PERS_STORAGE_ACCOUNT_NAME="frontarenastorage"
$STORAGE_KEY="mypassword"
kubectl create secret generic fa-fileshare-secret --from-literal=azurestorageaccountname=$AKS_PERS_STORAGE_ACCOUNT_NAME --from-literal=azurestorageaccountkey=$STORAGE_KEY
In that file share I have folders and files which I need to mount, and I reference azurecontainershare in the YAML. My YAML looks like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontarena-ads-deployment
  labels:
    app: frontarena-ads-deployment
spec:
  replicas: 1
  template:
    metadata:
      name: frontarena-ads-aks-test
      labels:
        app: frontarena-ads-aks-test
    spec:
      containers:
        - name: frontarena-ads-aks-test
          image: faselect-docker.dev/frontarena/ads:test1
          imagePullPolicy: Always
          ports:
            - containerPort: 9000
          volumeMounts:
            - name: ads-filesharevolume
              mountPath: /opt/front/arena/host
      volumes:
        - name: ads-filesharevolume
          azureFile:
            secretName: fa-fileshare-secret
            shareName: azurecontainershare
            readOnly: false
      imagePullSecrets:
        - name: fa-repo-secret
  selector:
    matchLabels:
      app: frontarena-ads-aks-test
The issue was caused by the AKS cluster and the Azure file share being deployed in different Azure regions. If they are in the same region you will not have this issue.
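A hedged way to verify this with the Azure CLI, using placeholder resource group and cluster names:

az aks show --resource-group <aks-rg> --name <cluster-name> --query location -o tsv
az storage account show --resource-group <storage-rg> --name frontarenastorage --query location -o tsv

If the two locations differ, recreate the file share (or the storage account) in the cluster's region.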
I have an Argo workflow that has two steps; the first runs on Linux and the second runs on Windows:
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: my-workflow-v1.13
spec:
  entrypoint: process
  volumeClaimTemplates:
    - metadata:
        name: workdir
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 1Gi
  arguments:
    parameters:
      - name: jobId
        value: 0
  templates:
    - name: process
      steps:
        - - name: prepare
            template: prepare
        - - name: win-step
            template: win-step
    - name: win-step
      nodeSelector:
        kubernetes.io/os: windows
      container:
        image: mcr.microsoft.com/windows/nanoserver:1809
        command: ["cmd", "/c"]
        args: ["dir", "C:\\workdir\\source"]
        volumeMounts:
          - name: workdir
            mountPath: /workdir
    - name: prepare
      nodeSelector:
        kubernetes.io/os: linux
      inputs:
        artifacts:
          - name: src
            path: /opt/workdir/source.zip
            s3:
              endpoint: minio:9000
              insecure: true
              bucket: "{{workflow.parameters.jobId}}"
              key: "source.zip"
              accessKeySecret:
                name: my-minio-cred
                key: accesskey
              secretKeySecret:
                name: my-minio-cred
                key: secretkey
      script:
        image: garthk/unzip:latest
        imagePullPolicy: IfNotPresent
        command: [sh]
        source: |
          unzip /opt/workdir/source.zip -d /opt/workdir/source
        volumeMounts:
          - name: workdir
            mountPath: /opt/workdir
Both steps share a volume.
To achieve that in Azure Kubernetes Service, I had to create two node pools, one for Linux nodes and another for Windows nodes.
The problem is that when I queue the workflow, sometimes it completes, and sometimes the win-step (the step that runs in the Windows container) hangs/fails and shows this message:
1 node(s) had volume node affinity conflict
I've read that this could happen because the volume gets scheduled in a specific zone and the Windows container (since it's in a different pool) gets scheduled in a different zone that doesn't have access to that volume, but I couldn't find a solution for that.
Please help.
the first runs on Linux and the second runs on Windows
I doubt that you can mount the same volume on both a Linux node (typically an ext4 file system) and a Windows node (Azure Windows containers use the NTFS file system).
So the volume that you try to mount in the second step is located on the node pool that does not match your nodeSelector.
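One possible workaround, not part of the answer above and given here only as a hedged sketch: back workdir with an SMB-based Azure Files storage class, which both Linux and Windows node pools can mount (the storage class name azurefile is an assumption about the cluster's configuration):

volumeClaimTemplates:
  - metadata:
      name: workdir
    spec:
      accessModes: [ "ReadWriteMany" ]
      storageClassName: azurefile   # assumption: SMB-backed class available in the cluster
      resources:
        requests:
          storage: 1Gi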
With spark-submit I launch an application on a Kubernetes cluster, and I can see the Spark UI only when I go to http://driver-pod:port.
How can I start the Spark History Server on the cluster?
How do I make all running Spark jobs register with the History Server?
Is this possible?
Yes, it is possible. Briefly, you will need to ensure the following:
Make sure all your applications store event logs in a specific location (filesystem, S3, HDFS, etc.).
Deploy the history server in your cluster with access to the above event log location.
By default Spark reads only from a filesystem path, so I will elaborate this case in detail with the spark operator.
Create a PVC with a volume type that supports ReadWriteMany mode, for example an NFS volume. The following snippet assumes you already have a storage class for NFS (nfs-volume) configured:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: spark-pvc
  namespace: spark-apps
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  resources:
    requests:
      storage: 5Gi
  storageClassName: nfs-volume
Make sure all your Spark applications have event logging enabled and pointing at the correct path:
sparkConf:
  "spark.eventLog.enabled": "true"
  "spark.eventLog.dir": "file:/mnt"
The event log volume is mounted into each application pod (you can also use the operator's mutating webhook to centralize it). An example manifest with the mentioned config is shown below:
---
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: spark-java-pi
  namespace: spark-apps
spec:
  type: Java
  mode: cluster
  image: gcr.io/spark-operator/spark:v2.4.4
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.11-2.4.4.jar"
  imagePullPolicy: Always
  sparkVersion: 2.4.4
  sparkConf:
    "spark.eventLog.enabled": "true"
    "spark.eventLog.dir": "file:/mnt"
  restartPolicy:
    type: Never
  volumes:
    - name: spark-data
      persistentVolumeClaim:
        claimName: spark-pvc
  driver:
    cores: 1
    coreLimit: "1200m"
    memory: "512m"
    labels:
      version: 2.4.4
    serviceAccount: spark
    volumeMounts:
      - name: spark-data
        mountPath: /mnt
  executor:
    cores: 1
    instances: 1
    memory: "512m"
    labels:
      version: 2.4.4
    volumeMounts:
      - name: spark-data
        mountPath: /mnt
Install the Spark history server, mounting the shared volume. Then you will have access to the events in the history server UI:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: spark-history-server
  namespace: spark-apps
spec:
  replicas: 1
  template:
    metadata:
      name: spark-history-server
      labels:
        app: spark-history-server
    spec:
      containers:
        - name: spark-history-server
          image: gcr.io/spark-operator/spark:v2.4.0
          resources:
            requests:
              memory: "512Mi"
              cpu: "100m"
          command:
            - /sbin/tini
            - -s
            - --
            - /opt/spark/bin/spark-class
            - -Dspark.history.fs.logDirectory=/data/
            - org.apache.spark.deploy.history.HistoryServer
          ports:
            - name: http
              protocol: TCP
              containerPort: 18080
          readinessProbe:
            timeoutSeconds: 4
            httpGet:
              path: /
              port: http
          livenessProbe:
            timeoutSeconds: 4
            httpGet:
              path: /
              port: http
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: spark-pvc
            readOnly: true
Feel free to configure an Ingress or Service for accessing the UI.
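For example, a minimal Service sketch, with names assumed to match the Deployment above:

apiVersion: v1
kind: Service
metadata:
  name: spark-history-server
  namespace: spark-apps
spec:
  selector:
    app: spark-history-server
  ports:
    - name: http
      port: 80
      targetPort: 18080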
You can also use Google Cloud Storage, Azure Blob Storage, or AWS S3 as the event log location. For this you will need to install some extra jars, so I would recommend having a look at the Lightbend spark-history-server image and charts.
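As a hedged sketch, the application-side config for an S3 event log location could look like the following; the bucket name is hypothetical, and the hadoop-aws/AWS SDK jars are assumed to already be on the image's classpath:

sparkConf:
  "spark.eventLog.enabled": "true"
  "spark.eventLog.dir": "s3a://my-spark-history/events"   # hypothetical bucket

The history server would then be started with -Dspark.history.fs.logDirectory=s3a://my-spark-history/events instead of the /data/ path used above.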
I am trying to mount an NFS volume to my pods but with no success.
I have a server exposing the NFS mount point; when I try to connect to it from some other running server:
sudo mount -t nfs -o proto=tcp,port=2049 10.0.0.4:/export /mnt works fine
Another thing worth mentioning: when I remove the volume from the deployment, the pod runs fine. I can log into it and telnet to 10.0.0.4 on ports 111 and 2049 successfully, so there really doesn't seem to be any communication problem.
as well as:
showmount -e 10.0.0.4
Export list for 10.0.0.4:
/export/drive 10.0.0.0/16
/export 10.0.0.0/16
So I can assume that there are no network or configuration problems between the server and the client (I am using Amazon, and the server I tested from is in the same security group as the k8s minions).
P.S:
The server is a simple ubuntu->50gb disk
Kubernetes v1.3.4
So I start by creating my PV:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 10.0.0.4
    path: "/export"
And my PVC:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nfs-claim
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 50Gi
Here is how kubectl describes them:
Name: nfs
Labels: <none>
Status: Bound
Claim: default/nfs-claim
Reclaim Policy: Retain
Access Modes: RWX
Capacity: 50Gi
Message:
Source:
Type: NFS (an NFS mount that lasts the lifetime of a pod)
Server: 10.0.0.4
Path: /export
ReadOnly: false
No events.
AND
Name: nfs-claim
Namespace: default
Status: Bound
Volume: nfs
Labels: <none>
Capacity: 0
Access Modes:
No events.
pod deployment:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: mypod
  labels:
    name: mypod
spec:
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      name: mypod
      labels:
        # Important: these labels need to match the selector above, the api server enforces this constraint
        name: mypod
    spec:
      containers:
        - name: abcd
          image: irrelevant to the question
          ports:
            - containerPort: 80
          env:
            - name: hello
              value: world
          volumeMounts:
            - mountPath: "/mnt"
              name: nfs
      volumes:
        - name: nfs
          persistentVolumeClaim:
            claimName: nfs-claim
When I deploy my pod I get the following:
Volumes:
  nfs:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  nfs-claim
    ReadOnly:   false
  default-token-6pd57:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-6pd57
QoS Tier:       BestEffort
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
13m 13m 1 {default-scheduler } Normal Scheduled Successfully assigned xxx-2140451452-hjeki to ip-10-0-0-157.us-west-2.compute.internal
11m 7s 6 {kubelet ip-10-0-0-157.us-west-2.compute.internal} Warning FailedMount Unable to mount volumes for pod "xxx-2140451452-hjeki_default(93ca148d-6475-11e6-9c49-065c8a90faf1)": timeout expired waiting for volumes to attach/mount for pod "xxx-2140451452-hjeki"/"default". list of unattached/unmounted volumes=[nfs]
11m 7s 6 {kubelet ip-10-0-0-157.us-west-2.compute.internal} Warning FailedSync Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "xxx-2140451452-hjeki"/"default". list of unattached/unmounted volumes=[nfs]
I've tried everything I know and everything I can think of. What am I missing or doing wrong here?
I tested versions 1.3.4 and 1.3.5 of Kubernetes and the NFS mount didn't work for me. Later I switched to 1.2.5, and that version gave me some more detailed info (kubectl describe pod ...). It turned out that 'nfs-common' was missing in the hyperkube image. After I added nfs-common to all container instances based on the hyperkube image on the master and worker nodes, the NFS share started to work normally (the mount was successful). So that's the case here. I tested it in practice and it solved my problem.
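A hedged sketch of how that could look, assuming a Debian-based hyperkube image and that the image/tag below matches the one in use (both are assumptions):

# assumption: your hyperkube base image and tag
FROM gcr.io/google_containers/hyperkube-amd64:v1.3.5
RUN apt-get update && apt-get install -y nfs-common && rm -rf /var/lib/apt/lists/*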