Helm - Spark operator examples/spark-pi.yaml does not exist

I've deployed Spark Operator to GKE using the Helm Chart to a custom namespace:
helm install --name sparkoperator incubator/sparkoperator --namespace custom-ns --set sparkJobNamespace=custom-ns
and confirmed that the operator is running in the cluster with helm status sparkoperator.
However, when I try to run the Spark Pi example with kubectl apply -f examples/spark-pi.yaml, I get the following error:
the path "examples/spark-pi.yaml" does not exist
There are a few things that I probably still don't get:
Where is actually examples/spark-pi.yaml located after deploying the operator?
What else should I check and what other steps should I take to make the example work?

Please find the spark-pi.yaml file here.
You should copy it to your filesystem, customize it if needed, and provide a valid path to it with kubectl apply -f path/to/spark-pi.yaml.

kubectl apply needs a YAML file that is either available locally on the system where you run the kubectl command, or hosted at an http/https endpoint.
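For illustration, a minimal sketch of both options (the local path and the URL below are placeholders, not the actual location of the file; the namespace comes from the question):
# Apply from a local copy of the example, into the namespace used for the operator:
kubectl apply -f /path/to/spark-pi.yaml -n custom-ns
# Or apply a manifest hosted at an http/https endpoint:
kubectl apply -f https://example.com/manifests/spark-pi.yaml -n custom-ns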

Installing nginx ingress controller into AKS cluster - can't pull image from Azure Container Registry - 401 Unauthorized

I'm trying to install an nginx ingress controller into an Azure Kubernetes Service cluster using Helm, following this Microsoft guide. It fails when I use Helm to install the ingress controller, because it needs to pull a "kube-webhook-certgen" image from a local Azure Container Registry (which I created and linked to the cluster); the pod that's initially scheduled in the cluster fails to pull the image and shows the following error when I use kubectl describe pod [pod_name]:
failed to resolve reference "letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen#sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068": failed to authorize: failed to fetch anonymous token: unexpected status: 401 Unauthorized]
The relevant section of the guide describes using Helm to create an ingress controller.
The guide describes creating an Azure Container Registry and linking it to a Kubernetes cluster, which I've done successfully using:
az aks update -n myAKSCluster -g myResourceGroup --attach-acr <acr-name>
I then imported the required third-party images successfully into my 'local' Azure Container Registry as detailed in the guide, and checked that the cluster has access to the registry using:
az aks check-acr --name MyAKSCluster --resource-group myResourceGroup --acr letsencryptdemoacr.azurecr.io
I also used the Azure Portal to check permissions on the Azure Container Registry and the specific repository that has the issue. It shows that both the cluster and the repository have the AcrPull permission.
When I run the Helm command to create the ingress controller, it fails at the point where it tries to create a Kubernetes pod named nginx-ingress-ingress-nginx-admission-create in the ingress-basic namespace that I created. When I use kubectl describe pod [pod_name_here], it shows the following error, which prevents creation of the ingress controller from continuing:
Failed to pull image "letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen:v1.5.1#sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068": [rpc error: code = NotFound desc = failed to pull and unpack image "letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen#sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068": failed to resolve reference "letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen#sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068": letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen#sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068: not found, rpc error: code = Unknown desc = failed to pull and unpack image "letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen#sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068": failed to resolve reference "letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen#sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068": failed to authorize: failed to fetch anonymous token: unexpected status: 401 Unauthorized]
This is the helm command that I run in a Linux terminal:
helm install nginx-ingress ingress-nginx/ingress-nginx \
  --namespace ingress-basic \
  --set controller.replicaCount=1 \
  --set controller.nodeSelector."kubernetes\.io/os"=linux \
  --set controller.image.registry=$ACR_URL \
  --set controller.image.image=$CONTROLLER_IMAGE \
  --set controller.image.tag=$CONTROLLER_TAG \
  --set controller.image.digest="" \
  --set controller.admissionWebhooks.patch.nodeSelector."kubernetes\.io/os"=linux \
  --set controller.admissionWebhooks.patch.image.registry=$ACR_URL \
  --set controller.admissionWebhooks.patch.image.image=$PATCH_IMAGE \
  --set controller.admissionWebhooks.patch.image.tag=$PATCH_TAG \
  --set defaultBackend.nodeSelector."kubernetes\.io/os"=linux \
  --set defaultBackend.image.registry=$ACR_URL \
  --set defaultBackend.image.image=$DEFAULTBACKEND_IMAGE \
  --set defaultBackend.image.tag=$DEFAULTBACKEND_TAG \
  --set controller.service.loadBalancerIP=$STATIC_IP \
  --set controller.service.annotations."service\.beta\.kubernetes\.io/azure-dns-label-name"=$DNS_LABEL
I'm using the following relevant environment variables:
$ACR_URL=letsencryptdemoacr.azurecr.io
$PATCH_IMAGE=jettech/kube-webhook-certgen
$PATCH_TAG=v1.5.1
How do I fix the authorization?
It seems like the issue is caused by the new ingress-nginx/ingress-nginx helm chart release. I have fixed it by using version 3.36.0 instead of the latest (4.0.1).
helm upgrade -i nginx-ingress ingress-nginx/ingress-nginx \
--version 3.36.0 \
...
Azure support identified and provided a solution to this, and essentially confirmed that the Microsoft tutorial is now outdated relative to the current Helm chart release for the ingress controller.
The full error message I was getting was similar to the following, which shows that the first error encountered is actually that the image is NotFound; the message about Unauthorized is misleading. The issue appears to be that the install references digests for a couple of the required images (a digest is a unique identifier for an image). The install was using the digests of the Docker images from their original location, not the digests of the copies I had imported into my Azure Container Registry, so the pulls failed because the referenced digests didn't match the imported images.
Failed to pull image 'letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen:v1.5.1#sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068': [rpc error: code = NotFound desc = failed to pull and unpack image 'letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen#sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068': failed to resolve reference 'letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen#sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068': letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen#sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068: not found, rpc error: code = Unknown desc = failed to pull and unpack image 'letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen#sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068': failed to resolve reference 'letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen#sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068': failed to authorize: failed to fetch anonymous token: unexpected status: 401 Unauthorized]
The digests of the images that I'd imported into my local Azure Container Registry needed to be passed as additional arguments to the helm install (see below for how to look them up):
--set controller.image.digest="sha256:e9fb216ace49dfa4a5983b183067e97496e7a8b307d2093f4278cd550c303899" \
--set controller.admissionWebhooks.patch.image.digest="sha256:950833e19ade18cd389d647efb88992a7cc077abedef343fa59e012d376d79b7" \
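To find the digests of the imported copies (rather than the upstream images), you can query the registry; a hedged sketch using the Azure CLI, with the registry and repository names taken from this question:
# List the manifests, including digests, for the imported kube-webhook-certgen repository:
az acr repository show-manifests --name letsencryptdemoacr --repository jettech/kube-webhook-certgen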
I then had a second issue where the ingress controller pod went into CrashLoopBackOff. I fixed this by re-importing a different version of the ingress controller image than the one referenced in the tutorial, as follows:
Set the environment variable that identifies the tag to pull for the ingress controller image:
CONTROLLER_TAG=v1.0.0
Delete the ingress controller repository from the Azure Container Registry (I did this manually via the portal), then re-import it using the following (the values of the other variables should be as specified in the Microsoft tutorial):
az acr import --name $REGISTRY_NAME --source $CONTROLLER_REGISTRY/$CONTROLLER_IMAGE:$CONTROLLER_TAG --image $CONTROLLER_IMAGE:$CONTROLLER_TAG
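If you prefer the CLI over the portal for the delete step above, something like the following should also work (variable names follow the tutorial's conventions):
# Remove the previously imported controller repository before re-importing it:
az acr repository delete --name $REGISTRY_NAME --repository $CONTROLLER_IMAGE --yes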
Make sure you set all the digests to empty:
--set controller.image.digest=""
--set controller.admissionWebhooks.patch.image.digest=""
--set defaultBackend.image.digest=""
Basically, this pulls the image <your-registry>.azurecr.io/ingress-nginx/controller:<version> without the digest suffix.
The other problem: if you use the latest chart version, the deployment will go into CrashLoopBackOff. Checking the live log of the pod, you will see a problem with flags, e.g. Unknown flag --controller-class. To resolve this, specify the --version flag in the helm install command and use version 3.36.0; the deployment problems should then be resolved, as in the sketch below.
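A hedged sketch combining both suggestions, with the release and namespace names taken from the question (the remaining --set flags from the original command still apply):
helm upgrade -i nginx-ingress ingress-nginx/ingress-nginx \
  --namespace ingress-basic \
  --version 3.36.0 \
  --set controller.image.digest="" \
  --set controller.admissionWebhooks.patch.image.digest="" \
  --set defaultBackend.image.digest=""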
Faced the same issue on AWS, and using an older version of the Helm chart helped. I used version 3.36.0 and it worked fine.

Can I deploy using JSON string in Kubernetes?

As per the kubectl documentation, kubectl apply accepts a file or stdin. My use case is that service/deployment JSON strings are produced at runtime, and I have to deploy them to clusters using Node.js. Of course, I can create files and just do kubectl apply -f thefilename, but I don't want to create files. Is there any approach where I can do something like the following:
kubectl apply "{"apiVersion": "extensions/v1beta1","kind": "Ingress"...}"
For the record, I am using the node_ssh library.
echo 'your manifest' | kubectl create -f -
Reference:
https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#apply
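As a minimal sketch, kubectl also accepts JSON on stdin with apply (the trailing '-' means read from stdin); the manifest below is a trivial placeholder, not from the question:
# Pipe a JSON manifest straight to kubectl without writing a file:
echo '{"apiVersion":"v1","kind":"Namespace","metadata":{"name":"demo-ns"}}' | kubectl apply -f -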

Use GitHub Actions and Helm to deploy to AKS

I have set up an Azure Kubernetes Service cluster and successfully deployed multiple Helm charts manually.
I now want to set up a CD pipeline using GitHub Actions and Helm to deploy (that is, install and upgrade) a Helm chart whenever the Action is triggered.
So far I have only found Actions that use kubectl for deployment, which I don't want to use, because some secrets provided in the manifests shouldn't be checked into version control; hence the decision for Helm, as it can fill in these secrets with values provided as environment variables when running the helm install command:
# without Helm
...
clientId: secretValue
# with Helm
...
clientId: {{ .Values.clientId }}
The "secret" would be provided like this: helm install --set clientId=secretValue.
Now the question is how can I achieve this using GitHub Actions? Are there any "ready-to-use" solutions available that I just haven't found or do I have to approach this in a completely different way?
Seems like I made things more complicated than I needed to.
I ended up writing a simple GitHub Action based on the alpine/helm Docker image and was able to successfully set up the CD pipeline to AKS; the core of the helm step it runs is sketched below.
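A hedged sketch of that step: the release name, chart path, namespace, and variable names are assumptions, not taken from the original setup, and the secret is expected to arrive as an environment variable populated from the repository's secrets:
# Run inside the GitHub Actions step, with CLIENT_ID exposed from a repository secret:
helm upgrade --install my-release ./chart \
  --namespace my-namespace \
  --set clientId="$CLIENT_ID"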

spark docker-image-tool Cannot find docker image

I deployed Spark on Kubernetes:
helm install microsoft/spark --version 1.0.0 (also tried bitnami chart with the same result)
Then, as described in https://spark.apache.org/docs/latest/running-on-kubernetes.html#submitting-applications-to-kubernetes,
I go to $SPARK_HOME/bin and run
docker-image-tool.sh -r -t my-tag build
This returns:
Cannot find docker image. This script must be run from a runnable distribution of Apache Spark.
but all Spark runnables are in this directory:
bash-4.4# cd $SPARK_HOME/bin
bash-4.4# ls
beeline find-spark-home.cmd pyspark.cmd spark-class spark-shell.cmd spark-sql2.cmd sparkR
beeline.cmd load-spark-env.cmd pyspark2.cmd spark-class.cmd spark-shell2.cmd spark-submit sparkR.cmd
docker-image-tool.sh load-spark-env.sh run-example spark-class2.cmd spark-sql spark-submit.cmd sparkR2.cmd
find-spark-home pyspark run-example.cmd spark-shell spark-sql.cmd spark-submit2.cmd
Any suggestions on what I am doing wrong?
I haven't made any other Spark configuration. Am I missing something? Should I install Docker myself, or any other tools?
You are mixing two things here.
When you run helm install microsoft/spark --version 1.0.0, you deploy Spark with all prerequisites inside Kubernetes. Helm does all the hard work for you; after you run this, Spark is ready to use.
Then, after deploying Spark using Helm, you are trying to build and deploy Spark again from inside a Spark pod that is already running on Kubernetes.
These are two different things that are not meant to be mixed. The guide you linked explains how to run Spark on Kubernetes by hand, but fortunately it can be done using Helm, as you did before.
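If you do want to build the Spark container images yourself, the script must be run against a full Spark distribution downloaded from spark.apache.org (not from inside a pod), and -r expects a repository name; a hedged sketch, with the repository, tag, and path as placeholders:
# From the root of an unpacked Spark distribution (not from bin/):
cd /path/to/spark-x.y.z-bin-hadoopN
./bin/docker-image-tool.sh -r my-repo -t my-tag build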
When you run helm install myspark microsoft/spark --version 1.0.0, the output tells you how to access your Spark web UI:
NAME: myspark
LAST DEPLOYED: Wed Apr 8 08:01:39 2020
NAMESPACE: default
STATUS: deployed
REVISION: 1
NOTES:
1. Get the Spark URL to visit by running these commands in the same shell:
NOTE: It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status of by running 'kubectl get svc --namespace default -w myspark-webui'
export SPARK_SERVICE_IP=$(kubectl get svc --namespace default myspark-webui -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo http://$SPARK_SERVICE_IP:8080
2. Get the Zeppelin URL to visit by running these commands in the same shell:
NOTE: It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status of by running 'kubectl get svc --namespace default -w myspark-zeppelin'
export ZEPPELIN_SERVICE_IP=$(kubectl get svc --namespace default myspark-zeppelin -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo http://$ZEPPELIN_SERVICE_IP:8080
Let's check it:
$ export SPARK_SERVICE_IP=$(kubectl get svc --namespace default myspark-webui -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
$ echo http://$SPARK_SERVICE_IP:8080
http://34.70.212.182:8080
If you open this URL, your Spark web UI is ready.

ERROR: (gcloud.run.deploy) Error parsing [service]. The [service] resource is not properly specified. Failed to find attribute [service]

I am trying to deploy my Node.js application to Cloud Run using the following command:
gcloud run deploy --image gcr.io/[project-id]/helloworld --platform managed
Before running this command I built two Cloud Build images, and I'm trying to deploy the latest build with the above command, but I'm getting the following error:
ERROR: (gcloud.run.deploy) Error parsing [service].
The [service] resource is not properly specified.
Failed to find attribute [service]. The attribute can be set in the following ways:
- provide the argument [SERVICE] on the command line
- specify the service name from an interactive prompt
I don't know what is causing the error. Can anybody help me with this? Thank you.
Try specifying the Service ID as an argument, replacing my-service with the desired name:
gcloud run deploy my-service --image gcr.io/[project-id]/helloworld --platform managed
Also, make sure you're using the latest Cloud SDK with gcloud components update.
Running gcloud run deploy --image gcr.io/cloudrun/hello --platform managed should prompt you to pick a platform, a region, and a service name. Please try this command.
In your command, make sure to replace [project-id] with your GCP project ID.
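A hedged combined sketch of the two suggestions above (my-service and my-project are placeholders, not from the question):
# Update the SDK, then deploy with an explicit service name:
gcloud components update
gcloud run deploy my-service --image gcr.io/my-project/helloworld --platform managed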
