How to enable Cassandra Password Authentication in Kubernetes deployment file - cassandra

I've been struggling with this for quite a while now. My effort so far is shown below. The env variable CASSANDRA_AUTHENTICATOR is, in my opinion, supposed to enable password authentication. However, I'm still able to log on without a password after redeploying with this config. Any ideas on how to enable password authentication in a Kubernetes deployment file?
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: cassandra
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
      - name: cassandra
        image: cassandra
        env:
        - name: CASSANDRA_CLUSTER_NAME
          value: Cassandra
        - name: CASSANDRA_AUTHENTICATOR
          value: PasswordAuthenticator
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042
          name: cql
        volumeMounts:
        - mountPath: /var/lib/cassandra/data
          name: data
      volumes:
      - name: data
        emptyDir: {}
The environment is Google Cloud Platform.

So I made a few changes to the artifact you mentioned:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: cassandra
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
      - name: cassandra
        image: bitnami/cassandra:latest
        env:
        - name: CASSANDRA_CLUSTER_NAME
          value: Cassandra
        - name: CASSANDRA_PASSWORD
          value: pass123
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042
          name: cql
        volumeMounts:
        - mountPath: /var/lib/cassandra/data
          name: data
      volumes:
      - name: data
        emptyDir: {}
The changes I made:
the image has been changed to bitnami/cassandra:latest, and the env var CASSANDRA_AUTHENTICATOR has been replaced with CASSANDRA_PASSWORD.
After deploying the above artifact, I could authenticate as shown below.
Trying to exec into the pod:
fedora@dhcp35-42:~/tmp/cassandra$ oc exec -it cassandra-2750650372-g8l9s bash
root@cassandra-2750650372-g8l9s:/#
Once inside the pod, try authenticating with the server:
root@cassandra-2750650372-g8l9s:/# cqlsh 127.0.0.1 9042 -p pass123 -u cassandra
Connected to Cassandra at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.0 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cassandra@cqlsh>
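To confirm that authentication is actually enforced now, you can also try connecting without credentials; this should be rejected with an authentication error:
root@cassandra-2750650372-g8l9s:/# cqlsh 127.0.0.1 9042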
This image's documentation can be found at https://hub.docker.com/r/bitnami/cassandra/
If you are not comfortable using a third-party image and want to use the image that the upstream community manages, then look at the following solution, which is more DIY but also more flexible.
To set up the password you were trying to use the env var CASSANDRA_AUTHENTICATOR, but that proposal has not been merged into the cassandra image yet. You can see the open PRs here.
Right now the upstream suggests mounting a cassandra.yaml file at /etc/cassandra/cassandra.yaml, so that people can set whatever settings they want.
So follow these steps to do it:
Download the cassandra.yaml.
I made the following changes to the file:
$ diff cassandra.yaml mycassandra.yaml
103c103
< authenticator: AllowAllAuthenticator
---
> authenticator: PasswordAuthenticator
Create a ConfigMap from that file.
We have to create a Kubernetes ConfigMap, which we will then mount inside the container; we cannot do a host mount the way we would with plain Docker.
$ cp mycassandra.yaml cassandra.yaml
$ k create configmap cassandraconfig --from-file ./cassandra.yaml
The name of the ConfigMap is cassandraconfig.
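If you want to double-check what ended up in it (optional; k is the same kubectl alias used below):
$ k get configmap cassandraconfig -o yaml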
Now edit the deployment to use this ConfigMap and mount it in the right place:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: cassandra
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
      - name: cassandra
        image: cassandra
        env:
        - name: CASSANDRA_CLUSTER_NAME
          value: Cassandra
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042
          name: cql
        volumeMounts:
        - mountPath: /var/lib/cassandra/data
          name: data
        - mountPath: /etc/cassandra/
          name: cassandraconfig
      volumes:
      - name: data
        emptyDir: {}
      - name: cassandraconfig
        configMap:
          name: cassandraconfig
Once you have created this deployment, exec into the pod:
$ k exec -it cassandra-1663662957-6tcj6 bash
root@cassandra-1663662957-6tcj6:/#
Try using the client:
root@cassandra-1663662957-6tcj6:/# cqlsh 127.0.0.1 9042
Connection error: ('Unable to connect to any servers', {'127.0.0.1': AuthenticationFailed('Remote end requires authentication.',)})
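At this point you can log in with Cassandra's default superuser credentials (cassandra/cassandra) and then change the password; a minimal sketch, assuming the defaults have not been altered:
root@cassandra-1663662957-6tcj6:/# cqlsh 127.0.0.1 9042 -u cassandra -p cassandra
cassandra@cqlsh> ALTER USER cassandra WITH PASSWORD 'pass123';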
For more information on creating a ConfigMap and mounting it inside a container, you can read this doc, which helped me with this answer.

If you really don't want to replace the official cassandra Docker image with bitnami's version, but you still want to enable password authentication for accessing the CQL shell, you can achieve that by modifying the Cassandra configuration file. Namely, password authentication is enabled by setting the following property in the /etc/cassandra/cassandra.yaml file: authenticator: PasswordAuthenticator
Since it does not matter whether a given property is defined once or multiple times (the last definition wins), the aforementioned line can simply be appended to the Cassandra configuration file. An alternative would be to use sed for an in-place search-and-replace, but IMHO that would be unnecessary overkill, both performance-wise and readability-wise.
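For reference, the sed variant mentioned above would be an in-place edit of the same file, roughly:
sed -i 's/^authenticator:.*/authenticator: PasswordAuthenticator/' /etc/cassandra/cassandra.yaml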
Long story short: specify the container's startup command/entrypoint (with its arguments) so that the config file is adapted first and the image's original startup command/entrypoint is executed afterwards. Since the container definition in Docker Compose and Kubernetes YAML only allows a single startup command, specify a standard/Bourne shell as the command and have it execute those two steps.
Therefore the answer would be adding the following two lines:
command: ["/bin/sh"]
args: ["-c", "echo 'authenticator: PasswordAuthenticator' >> /etc/cassandra/cassandra.yaml && docker-entrypoint.sh cassandra -f"]
so the OP's Kubernetes deployment file would be the following:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: cassandra
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
      - name: cassandra
        image: cassandra
        command: ["/bin/sh"]
        args: ["-c", "echo 'authenticator: PasswordAuthenticator' >> /etc/cassandra/cassandra.yaml && docker-entrypoint.sh cassandra -f"]
        env:
        - name: CASSANDRA_CLUSTER_NAME
          value: Cassandra
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042
          name: cql
        volumeMounts:
        - mountPath: /var/lib/cassandra/data
          name: data
      volumes:
      - name: data
        emptyDir: {}
Disclaimer: if 'latest' is used as the image tag of the official Cassandra image, and if at some point the image's original entrypoint (docker-entrypoint.sh cassandra -f) changes, then this container might have issues starting Cassandra. However, since the entrypoint and its args have remained unchanged from the initial version up to the latest version at the time of writing this post (4.0), it is very likely to stay as-is, so this approach/workaround should work fine.
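If you want to verify that assumption for the exact tag you are pulling before relying on this workaround, you can inspect the image's entrypoint and default command, for example:
docker pull cassandra:latest
docker inspect --format '{{.Config.Entrypoint}} {{.Config.Cmd}}' cassandra:latest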

Related

Kubernetes Crashloopbackoff With Minikube

So I am learning about Kubernetes with a guide, and I am trying to deploy a MongoDB Pod with 1 replica. This is the deployment config file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongodb-deployment
  labels:
    app: mongodb
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
      - name: mongodb
        image: mongo
        ports:
        - containerPort: 27017
        env:
        - name: MONGO_INITDB_ROOT_USERNAME
          valueFrom:
            secretKeyRef:
              name: mongodb-secret
              key: mongo-root-username
        - name: MONGO_INITDB_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mongodb-secret
              key: mongo-root-password
---
apiVersion: v1
kind: Service
metadata:
  name: mongodb-service
spec:
  selector:
    app: mongodb
  ports:
    - protocol: TCP
      port: 27017
      targetPort: 27017
I also tried to deploy a Mongo-Express Pod with almost the same config file, but I keep getting CrashLoopBackOff for both Pods. From the little understanding I have, this is caused by the container failing and restarting in a cycle. I tried going through the events with kubectl get events, and I see that a warning with the message Back-off restarting failed container keeps occurring. I also did a little digging around and came across a solution that says to add
command: ['sleep']
args: ['infinity']
That fixed the CrashLoopBackOff issue, but when I try to get the logs for the Pod, nothing is displayed on the terminal. I need some help and a possible explanation of how the command and args seem to fix it, and how I can stop this crash from happening to my Pods. Thank you very much.
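As a side note, rather than masking the crash with sleep, the underlying failure can usually be read from the previous container's logs and the pod events (the pod name below is a placeholder):
kubectl logs <mongodb-pod-name> --previous
kubectl describe pod <mongodb-pod-name>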
My advice is to deploy MongoDB as a StatefulSet on Kubernetes.
In a stateful application, the N replicas of master nodes manage several worker nodes in a cluster. So, if any master node goes down, the other ordinal instances remain active to execute the workflow. Each master node instance is identified by a unique ordinal number, which is what a StatefulSet provides.
See more: mongodb-sts, mongodb-on-kubernetes.
Also use a Headless Service to manage the domain of a Pod. With a Headless Service there is no LoadBalancer or single Service IP proxied by kube-proxy; clients interact directly with the Pods, so the cluster IP is set to None.
In your case:
apiVersion: v1
kind: Service
metadata:
  name: mongodb
spec:
  clusterIP: None
  selector:
    app: mongodb
  ports:
    - port: 27017
The error:
Also uncaught exception: Error: couldn't add user: Error preflighting normalization: U_STRINGPREP_PROHIBITED_ERROR _getErrorWithCode#src/mongo/shell/utils.js:25:13
indicates that the secret may be missing. Take a look: mongodb-initializating.
In your case the secret should look similar to this:
apiVersion: v1
kind: Secret
metadata:
  name: mongodb-secret
type: Opaque
data:
  mongo-root-username: YWRtaW4=
  mongo-root-password: MWYyZDFlMmU2N2Rm
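Those data values are just base64-encoded strings (the ones above decode to admin and 1f2d1e2e67df), so an equivalent way to create the secret without encoding by hand would be:
kubectl create secret generic mongodb-secret \
  --from-literal=mongo-root-username=admin \
  --from-literal=mongo-root-password=1f2d1e2e67df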
Remember to also configure a volume for your pods; follow the tutorials I linked above.
Deploy mongodb as a StatefulSet, not as a Deployment.
Example:
apiVersion: v1
kind: Service
metadata:
  name: mongodb-service
  labels:
    name: mongo
spec:
  ports:
  - port: 27017
    targetPort: 27017
  clusterIP: None
  selector:
    role: mongo
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongod
spec:
  serviceName: mongodb-service
  replicas: 3
  template:
    metadata:
      labels:
        role: mongo
        environment: test
        replicaset: MainRepSet
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: replicaset
                  operator: In
                  values:
                  - MainRepSet
              topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 10
      volumes:
      - name: secrets-volume
        secret:
          secretName: shared-bootstrap-data
          defaultMode: 256
      containers:
      - name: mongod-container
        #image: pkdone/mongo-ent:3.4
        image: mongo
        command:
        - "numactl"
        - "--interleave=all"
        - "mongod"
        - "--wiredTigerCacheSizeGB"
        - "0.1"
        - "--bind_ip"
        - "0.0.0.0"
        - "--replSet"
        - "MainRepSet"
        - "--auth"
        - "--clusterAuthMode"
        - "keyFile"
        - "--keyFile"
        - "/etc/secrets-volume/internal-auth-mongodb-keyfile"
        - "--setParameter"
        - "authenticationMechanisms=SCRAM-SHA-1"
        resources:
          requests:
            cpu: 0.2
            memory: 200Mi
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: secrets-volume
          readOnly: true
          mountPath: /etc/secrets-volume
        - name: mongodb-persistent-storage-claim
          mountPath: /data/db
  volumeClaimTemplates:
  - metadata:
      name: mongodb-persistent-storage-claim
      annotations:
        volume.beta.kubernetes.io/storage-class: "standard"
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
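Once applied, the StatefulSet starts the pods one by one in ordinal order; a quick way to watch all three members come up is:
kubectl get pods -l role=mongo -w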

Kubernetes: Cassandra(stateful set) deployment on GCP

Has anyone tried deploying Cassandra (POC) on GCP using Kubernetes (not GKE)? If so, can you please share info on how to get it working?
You could start by looking at IBM's Scalable-Cassandra-deployment-on-Kubernetes.
For seeds discovery you can use a headless service, similar to this Multi-node Cassandra Cluster Made Easy with Kubernetes.
Some difficulties:
fast local storage for K8s is still in beta; of course, you can use what K8s already has; some users report running Ceph RBD on K8s with 8 C* nodes, each holding 2TB of data.
at some point you will realize that you need a C* operator; here are some good starting points: Instaclustr's Cassandra Operator and Pantheon Systems' Cassandra Operator
you need a way to scale stateful applications in gracefully (this should also be covered by the operator; the linked solution works if you don't want an operator, but you still need to use a controller).
You could also check the Cassandra mailing list, since there are people there already using Cassandra over K8s in production.
I have implemented cassandra on kubernetes. Please find my deployment and service yaml files:
apiVersion: v1
kind: Service
metadata:
  labels:
    app: cassandra
  name: cassandra
spec:
  clusterIP: None
  ports:
  - port: 9042
  selector:
    app: cassandra
---
apiVersion: apps/v1beta2
kind: StatefulSet
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  serviceName: cassandra
  replicas: 3
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      terminationGracePeriodSeconds: 1800
      containers:
      - name: cassandra
        image: gcr.io/google-samples/cassandra:v12
        imagePullPolicy: Always
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042
          name: cql
        resources:
          limits:
            cpu: "500m"
            memory: 1Gi
          requests:
            cpu: "500m"
            memory: 1Gi
        securityContext:
          capabilities:
            add:
            - IPC_LOCK
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/sh
              - -c
              - nodetool drain
        env:
        - name: MAX_HEAP_SIZE
          value: 512M
        - name: HEAP_NEWSIZE
          value: 100M
        - name: CASSANDRA_SEEDS
          value: "cassandra-0.cassandra.default.svc.cluster.local"
        - name: CASSANDRA_CLUSTER_NAME
          value: "K8Demo"
        - name: CASSANDRA_DC
          value: "DC1-K8Demo"
        - name: CASSANDRA_RACK
          value: "Rack1-K8Demo"
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        readinessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - /ready-probe.sh
          initialDelaySeconds: 15
          timeoutSeconds: 5
        volumeMounts:
        - name: cassandra-data
          mountPath: /cassandra_data
  volumeClaimTemplates:
  - metadata:
      name: cassandra-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "fast"
      resources:
        requests:
          storage: 5Gi
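After the pods are running, one quick sanity check (assuming the default namespace) is to ask the first node for the ring status:
kubectl exec -it cassandra-0 -- nodetool status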
Hope this helps.
Use Helm:
On Mac:
brew install helm@2
brew link --force helm@2
helm init
To Avoid Kubernetes Helm permission Hell:
from: https://github.com/helm/helm/issues/2224:
kubectl create serviceaccount --namespace kube-system tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
Cassandra
incubator:
helm repo add incubator https://charts.helm.sh/incubator
helm install --namespace "cassandra" -n "cassandra" incubator/cassandra
helm status "cassandra"
helm delete --purge "cassandra"
bitnami:
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install --namespace "cassandra" -n "my-deployment" bitnami/cassandra
helm status "my-deployment"
helm delete --purge "my-deployment"
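If the bitnami chart generated a database password for the release, it is stored in a Secret the chart creates; a hedged example of reading it back (the secret name and key can vary between chart versions):
kubectl get secret --namespace cassandra my-deployment-cassandra -o jsonpath="{.data.cassandra-password}" | base64 --decode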

How to deploy a node.js with redis on kubernetes?

I have a very simple node.js application (HTTP service), which "talks" to redis. I want to create a deployment and run it with minikube.
From my understanding, I need a kubernetes Pod for my app, based on the docker image. Here's my Dockerfile:
FROM node:8.9.1
WORKDIR /usr/src/app
COPY package*.json ./
RUN npm install
COPY . .
EXPOSE 8080
CMD ["npm", "start"]
I build the docker image with docker build -t my-app .
Next, I created a Pod definition for my app's Pod:
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: my-app
    image: my-app:latest
    imagePullPolicy: Never
    ports:
    - containerPort: 8080
So far, so good. But from now on, I have no clear idea how to proceed with redis:
should redis be another Pod, or a Service (in terms of Kubernetes kind)?
How do I reference redis from inside my app? Based on whether redis will be defined as a Pod/Service, how do I obtain a connection URL and port? I read about environment variables being created by Kubernetes, but I am not sure whether these work for Pods or Services.
How do I aggregate both (my app & redis) under a single configuration? How do I make sure that redis starts first, then my app (which requires a running redis instance), and how do I expose my HTTP endpoints to the "outside world"? I read about Deployments, but I am not sure how to connect these pieces together.
Ideally, I would like to have all configurations inside YAML files, so that at the end of the day the whole infrastructure could be started with a single command.
I think I figured out a solution (using a Deployment and a Service).
For my deployment, I used two containers (webapp + redis) within one Pod, since it doesn't make sense for the webapp to run without an active redis instance, and additionally it connects to redis upon application start. I could be wrong in this reasoning, so feel free to correct me if you think otherwise.
Here's my deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-deployment
spec:
  selector:
    matchLabels:
      app: my-app-deployment
  template:
    metadata:
      labels:
        app: my-app-deployment
    spec:
      containers:
      - name: redis
        image: redis:latest
        ports:
        - containerPort: 6379
        volumeMounts:
        - mountPath: /srv/www
          name: redis-storage
      - name: my-app
        image: my-app:latest
        imagePullPolicy: Never
        ports:
        - containerPort: 8080
      volumes:
      - name: redis-storage
        emptyDir: {}
And here's the Service definition:
apiVersion: v1
kind: Service
metadata:
  name: my-app-service
spec:
  ports:
  - port: 8080
    protocol: TCP
  type: NodePort
  selector:
    app: my-app-deployment
I create the deployment with:
kubectl create -f deployment.yaml
Then, I create the service with kubectl create -f service.yaml
I read the IP with minikube ip and extract the port from the output of kubectl describe service my-app-service.
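Alternatively, minikube can print the full URL of a NodePort service in one step:
minikube service my-app-service --url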
I agree with all of the previous answers. I'm just trying to make things simpler by executing a single command.
First, create the necessary manifests for redis in a file, say redis.yaml, along with a service to expose it outside the cluster.
apiVersion: v1
kind: Service
metadata:
  name: redis
  labels:
    app: node-redis
spec:
  ports:
  - name: redis
    port: 6379
    targetPort: 6379
  type: NodePort
  selector:
    app: node-redis
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
spec:
  selector:
    matchLabels:
      app: node-redis
  replicas: 1
  template:
    metadata:
      labels:
        app: node-redis
    spec:
      containers:
      - name: redis
        image: redis:latest
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 6379
        # data volume where redis writes data
        volumeMounts:
        - name: data
          mountPath: /data
          readOnly: false
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: redis-data
---
# data volume
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-data
  labels:
    app: node-redis
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
Next, put the manifests for your app in another file, say my-app.yaml. Here I added the volume field so that you can use the data stored by redis.
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  labels:
    app: node-redis
spec:
  containers:
  - name: my-app
    image: my-app:latest
    ports:
    - containerPort: 8080
    # data volume from which my-app reads the data written by redis
    volumeMounts:
    - name: data
      mountPath: /data
      readOnly: false
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: redis-data
Now we can use the following bash script, my-app.sh:
#!/bin/bash
kubectl create -f redis.yaml
pod_name=$(kubectl get po -l app=node-redis | grep redis | awk '{print $1}')
# check whether redis server is ready or not
while true; do
  pong=$(kubectl exec -it $pod_name -c redis redis-cli ping)
  if [[ "$pong" == *"PONG"* ]]; then
    echo ok
    break
  fi
done
kubectl create -f my-app.yaml
Just run chmod +x my-app.sh; ./my-app.sh to deploy. To get the URL, run minikube service redis --url. You can get the URL for your app similarly. The only thing is that you need a NodePort-type service for your app in order to access it from outside the cluster.
So, everything is in your hands now.
I would run redis in a separate pod (i.e. so your web app doesn't take down the redis server if it crashes).
Here is your redis deployment & service:
deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
spec:
  selector:
    matchLabels:
      app: redis
  replicas: 1
  template:
    metadata:
      labels:
        app: redis
    spec:
      volumes:
      - name: host-sys
        hostPath:
          path: /sys
      initContainers:
      - name: disable-thp
        image: redis:4.0-alpine
        volumeMounts:
        - name: host-sys
          mountPath: /host-sys
        command: ["sh", "-c", "echo never > /host-sys/kernel/mm/transparent_hugepage/enabled"]
      containers:
      - name: redis
        image: redis:4.0-alpine
        imagePullPolicy: IfNotPresent
        resources:
          requests:
            cpu: 350m
            memory: 1024Mi
        ports:
        - containerPort: 6379
service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: redis
  labels:
    app: redis
spec:
  ports:
  - port: 6379
    name: redis
  selector:
    app: redis
Since we've exposed a Kubernetes Service, you can then access your redis instance by hostname, or its "service name", which is redis.
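If you want to sanity-check that the name resolves from inside the cluster, a throwaway pod works (the busybox tag here is just an example):
kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- nslookup redis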
You can check out my kubernetes redis repository at https://github.com/mateothegreat/k8-byexamples-redis. You can simply run make install if you want the easier route.
Good luck and if you're still stuck please reach out!
Yes, you need a separate deployment and service for redis.
Use Kubernetes service discovery; it is built in (KubeDNS, CoreDNS).
Use readiness and liveness probes.
Yes, you can write a single big yaml file to describe all the deployments and services, then:
kubectl apply -f yourfile.yml
or you can place the yaml in separate files and then do:
kubectl apply -f dir/
I recommend you read the k8s docs further, but in general regarding your questions above:
Yes, another pod (with the relevant configuration) and an additional service, depending on your use case; check this great example: https://kubernetes.io/docs/tutorials/configuration/configure-redis-using-configmap/
Using services; read more here: https://kubernetes.io/docs/concepts/services-networking/connect-applications-service/
There are several ways to manage dependencies (search for deployment dependencies), but in general you can put them in the same file with a readiness endpoint and expose them using a Service; read more in the link in bullet 2.

Spark UI History server on Kubernetes?

With spark-submit I launch an application on a Kubernetes cluster, and I can see the Spark UI only when I go to http://driver-pod:port.
How can I start the Spark UI History Server on a cluster?
How can I make all running Spark jobs register with the Spark UI History Server?
Is this possible?
Yes, it is possible. Briefly, you will need to ensure the following:
Make sure all your applications store event logs in a specific location (filesystem, S3, HDFS, etc.).
Deploy the history server in your cluster with access to the above event log location.
Now Spark (by default) only reads from a filesystem path, so I will elaborate on this case in detail with the Spark operator:
Create a PVC with a volume type that supports ReadWriteMany mode, for example an NFS volume. The following snippet assumes you already have a storage class for NFS (nfs-volume) configured:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: spark-pvc
  namespace: spark-apps
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  resources:
    requests:
      storage: 5Gi
  storageClassName: nfs-volume
Make sure all your spark applications have event logging enabled and at the correct path:
sparkConf:
  "spark.eventLog.enabled": "true"
  "spark.eventLog.dir": "file:/mnt"
Mount the event log volume into each application pod (you can also use the operator's mutating webhook to centralize this). An example manifest with the mentioned config is shown below:
---
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: spark-java-pi
  namespace: spark-apps
spec:
  type: Java
  mode: cluster
  image: gcr.io/spark-operator/spark:v2.4.4
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.11-2.4.4.jar"
  imagePullPolicy: Always
  sparkVersion: 2.4.4
  sparkConf:
    "spark.eventLog.enabled": "true"
    "spark.eventLog.dir": "file:/mnt"
  restartPolicy:
    type: Never
  volumes:
  - name: spark-data
    persistentVolumeClaim:
      claimName: spark-pvc
  driver:
    cores: 1
    coreLimit: "1200m"
    memory: "512m"
    labels:
      version: 2.4.4
    serviceAccount: spark
    volumeMounts:
    - name: spark-data
      mountPath: /mnt
  executor:
    cores: 1
    instances: 1
    memory: "512m"
    labels:
      version: 2.4.4
    volumeMounts:
    - name: spark-data
      mountPath: /mnt
Install the Spark history server, mounting the shared volume. Then you will have access to the events in the history server UI:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: spark-history-server
  namespace: spark-apps
spec:
  replicas: 1
  template:
    metadata:
      name: spark-history-server
      labels:
        app: spark-history-server
    spec:
      containers:
      - name: spark-history-server
        image: gcr.io/spark-operator/spark:v2.4.0
        resources:
          requests:
            memory: "512Mi"
            cpu: "100m"
        command:
        - /sbin/tini
        - -s
        - --
        - /opt/spark/bin/spark-class
        - -Dspark.history.fs.logDirectory=/data/
        - org.apache.spark.deploy.history.HistoryServer
        ports:
        - name: http
          protocol: TCP
          containerPort: 18080
        readinessProbe:
          timeoutSeconds: 4
          httpGet:
            path: /
            port: http
        livenessProbe:
          timeoutSeconds: 4
          httpGet:
            path: /
            port: http
        volumeMounts:
        - name: data
          mountPath: /data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: spark-pvc
          readOnly: true
Feel free to configure an Ingress or Service for accessing the UI.
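For a quick look without writing an Ingress, you could, for example, expose the deployment as a NodePort service:
kubectl expose deployment spark-history-server --namespace spark-apps --type=NodePort --port=18080 --name=spark-history-server-ui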
You can also use Google Cloud Storage, Azure Blob Storage or AWS S3 as the event log location. For this you will need to install some extra jars, so I would recommend having a look at the Lightbend Spark history server image and charts.

Configuring Azure log analytics

I am following this documentation https://learn.microsoft.com/en-us/azure/aks/tutorial-kubernetes-monitor to configure a monitoring solution on AKS with the following yaml file
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: omsagent
spec:
  template:
    metadata:
      labels:
        app: omsagent
        agentVersion: 1.4.0-12
        dockerProviderVersion: 10.0.0-25
    spec:
      containers:
      - name: omsagent
        image: "microsoft/oms"
        imagePullPolicy: Always
        env:
        - name: WSID
          value: <WSID>
        - name: KEY
          value: <KEY>
        securityContext:
          privileged: true
        ports:
        - containerPort: 25225
          protocol: TCP
        - containerPort: 25224
          protocol: UDP
        volumeMounts:
        - mountPath: /var/run/docker.sock
          name: docker-sock
        - mountPath: /var/opt/microsoft/omsagent/state/containerhostname
          name: container-hostname
        - mountPath: /var/log
          name: host-log
        - mountPath: /var/lib/docker/containers/
          name: container-log
        livenessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - ps -ef | grep omsagent | grep -v "grep"
          initialDelaySeconds: 60
          periodSeconds: 60
      nodeSelector:
        beta.kubernetes.io/os: linux
      # Tolerate a NoSchedule taint on master that ACS Engine sets.
      tolerations:
      - key: "node-role.kubernetes.io/master"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"
      volumes:
      - name: docker-sock
        hostPath:
          path: /var/run/docker.sock
      - name: container-hostname
        hostPath:
          path: /etc/hostname
      - name: host-log
        hostPath:
          path: /var/log
      - name: container-log
        hostPath:
          path: /var/lib/docker/containers/
This fails with an error
error: error converting YAML to JSON: yaml: line 65: mapping values are not allowed in this context
I've verified that the file is syntactically correct using a yaml validator, so I'm not sure what's wrong.
This is kubernetes version 1.7
This also happens with version 1.9
That yaml file works for me:
[root@jasoncli@jasonye aksoms]# vi oms-daemonset.yaml
[root@jasoncli@jasonye aksoms]# kubectl create -f oms-daemonset.yaml
daemonset "omsagent" created
[root@jasoncli@jasonye aksoms]# kubectl get daemonset
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
omsagent 1 1 0 1 0 beta.kubernetes.io/os=linux 1m
Please check your kubectl client version with the command kubectl version; here is my output:
[root@jasoncli@jasonye aksoms]# kubectl version
Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.5", GitCommit:"cce11c6a185279d037023e02ac5249e14daa22bf", GitTreeState:"clean", BuildDate:"2017-12-07T16:16:03Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.7", GitCommit:"8e1552342355496b62754e61ad5f802a0f3f1fa7", GitTreeState:"clean", BuildDate:"2017-09-28T23:56:03Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
You can run the command az aks install-cli to install the kubectl client locally.
For more information about installing the Kubernetes command-line client, please refer to this article.
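For example, on a machine with the Azure CLI installed (the resource group and cluster names below are placeholders):
az aks install-cli
az aks get-credentials --resource-group myResourceGroup --name myAKSCluster
kubectl version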
