Docker-compose, conditional statements? (e.g. system dependent driver) - azure

This is related to the "Docker-compose, conditional statements? (e.g. add volume only if condition)" question, but I'm asking a new one because my problem is a bit different:
I have a docker-compose.yml with multiple services and multiple volume definitions.
I'd like to deploy it either locally or on Azure ACI.
I'd like to keep a single docker-compose.yml, but Azure ACI requires the storage driver to be specified:
mysql-data:
  driver: azure_file
  driver_opts:
    share_name: mysql
    storage_account_name: mystore
Obviously, this driver does not exist locally.
Using a variable for the driver, like:
mysql-data:
  driver: ${STORAGE_DRIVER}
  driver_opts:
    share_name: mysql
    storage_account_name: mystore
gives an error, because the local overlay2 driver has neither a share_name nor a storage_account_name option.
How can I solve this while keeping a single docker-compose.yml for the multi-container deployment?
Thank you.
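
One approach worth sketching here (it is not from the question, so treat the file names and exact CLI behaviour as assumptions to verify) is Compose's multi-file merging: keep the driver-agnostic volume in the base file and put the Azure-specific driver settings in an override file that is only passed when deploying to ACI.

# docker-compose.yml (base file, works locally with the default driver)
volumes:
  mysql-data:

# docker-compose.aci.yml (hypothetical override, only used for ACI)
volumes:
  mysql-data:
    driver: azure_file
    driver_opts:
      share_name: mysql
      storage_account_name: mystore

Locally you would run docker compose up with just the base file; for ACI you would merge both, e.g. docker compose -f docker-compose.yml -f docker-compose.aci.yml up. Whether the ACI integration in your Docker CLI version honours multiple -f files is something to confirm against its documentation.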

Related

Vaultwarden in Docker swarm with persistent storage on each worker node, issue

I have 4 nodes: one manager and three workers. On the three worker nodes I have configured lsyncd with the rsync -u flag (so it does not overwrite a file if the remote copy is newer) and delete=false. The daemon syncs /home/user/mydocker/vaultwarden/data across all worker nodes bidirectionally. Syncing works beautifully (I also tried GlusterFS).
My idea is to have only one replica on a worker node; in case of failure, Docker Swarm brings the service up on another node, and thanks to the synced data I should get the same copy of Vaultwarden, data included. It works, with one exception: when I reboot the node where the service is running, Docker redeploys the container on another node, and that container picks up its data from some kind of cache which replaces everything in my synced folder. Since that data is then the newer version, lsyncd syncs it out to the other nodes. So in this case I end up with an empty Vaultwarden, or, if there was data before, it reverts to a previous version. BUT if I manually bring Vaultwarden up with docker compose, then turn off the node (to simulate a failure) and bring the service up on another node with docker compose, everything works like a charm: the data persists and syncs without any problems.
My YAML config for the deployment:
version: '3'
services:
  vaultwarden:
    image: vaultwarden/server:latest
    environment:
      - ADMIN_TOKEN=XXXXXXXXXXXXX
      - SIGNUPS_ALLOWED=true
    volumes:
      - /home/user/mydocker/vaultwarden/data:/data
    ports:
      - "8877:80"
    deploy:
      placement:
        constraints:
          - "node.role==worker"
      mode: replicated
      replicas: 1

Issue mounting NFS share using Apache Spark 3.1.1 running on Kubernetes 1.21

From a Jupyter notebook I am creating a Spark context which deploys Spark on Kubernetes. This has been working fine for some time. I am now trying to configure the Spark context so that the driver and executors mount an NFS share to a local directory. Note that the NFS share I am trying to mount has been in use for some time, both via my k8s cluster and via other means.
According to the official documentation and the release article for 3.1.x, I should be able to modify my Spark conf with options that are in turn passed to Kubernetes.
My Spark conf in this example is set as:
sparkConf.set(f"spark.kubernetes.driver.volumes.nfs.myshare.mount.readOnly", "false")
sparkConf.set(f"spark.kubernetes.driver.volumes.nfs.myshare.mount.path", "/deltalake")
sparkConf.set(f"spark.kubernetes.driver.volumes.nfs.myshare.options.server", "15.4.4.1")
sparkConf.set(f"spark.kubernetes.driver.volumes.nfs.myshare.options.path", "/deltalake")
In my scenario the NFS share is "15.4.4.1:/deltalake", and I arbitrarily chose the name myshare to represent this NFS mount.
When I describe the pods created when I instantiate the Spark context, I do not see any mounts resembling these directives.
# kubectl describe pod <a-spark-pod>
...
Volumes:
  spark-conf-volume-exec:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      spark-exec-85efd381ea403488-conf-map
    Optional:  false
  spark-local-dir-1:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  kube-api-access-947xd:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class: Burstable
I also do not see anything in the logs for the pod indicating an issue.
Update:
I missed a key line of the documentation which states that drivers and executors have different configs.
The configuration properties for mounting volumes into the executor pods use prefix spark.kubernetes.executor. instead of spark.kubernetes.driver.
The second thing I missed is that the Docker image used by the Spark conf to provision the Kubernetes pods hosting the Spark executors needs to have the software for mounting NFS shares installed (i.e. the command-line utility that mounts an NFS share). The Spark integration silently fails if the NFS utils are not installed: if we describe the pod in this scenario, it will list an NFS volume, and if we execute code on each executor to list the contents of the mount directory, the mount path shows an empty directory. There is no indication of the failure when describing the pod or looking at its logs.
I am rebuilding the container images and will try again.
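
For reference, here is a minimal sketch of the conf with both prefixes, reusing the share name, server and paths from above (assuming the same sparkConf object as in the earlier snippet):

# Set the same NFS volume options for the driver and the executors;
# only the prefix differs (spark.kubernetes.driver. vs spark.kubernetes.executor.).
for role in ("driver", "executor"):
    sparkConf.set(f"spark.kubernetes.{role}.volumes.nfs.myshare.mount.path", "/deltalake")
    sparkConf.set(f"spark.kubernetes.{role}.volumes.nfs.myshare.mount.readOnly", "false")
    sparkConf.set(f"spark.kubernetes.{role}.volumes.nfs.myshare.options.server", "15.4.4.1")
    sparkConf.set(f"spark.kubernetes.{role}.volumes.nfs.myshare.options.path", "/deltalake")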
There are a few things needed to get this to work:
The spark conf needs to be configured for the driver and executor
The nfs utils package needs to be installed on the driver and executor nodes
The nfs server needs to be active and properly configured to allow connections
There are a few possible problems:
The mount does not succeed (server offline, path doesn't exist, path in use)
As a workaround:
After the Spark session is created, run a check on all the workers to confirm they have access to the mount and that the contents look right.
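
A hypothetical version of that check in PySpark (the mount path and the task count are assumptions based on this post, and spark is the active SparkSession):

import os

def check_mount(_):
    # Runs on an executor: report the hostname and a few entries under the mount path.
    return (os.uname().nodename, sorted(os.listdir("/deltalake"))[:5])

# One task per executor slot is enough to touch every worker in a small cluster.
print(spark.sparkContext.parallelize(range(8), 8).map(check_mount).collect())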

Persistent Volume Claim Kubernetes

kubectl version gives the following output:
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.4", GitCommit:"b695d79d4f967c403a96986f1750a35eb75e75f1", GitTreeState:"clean", BuildDate:"2021-11-17T15:48:33Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.5", GitCommit:"aea7bbadd2fc0cd689de94a54e5b7b758869d691", GitTreeState:"clean", BuildDate:"2021-09-15T21:04:16Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}
I have used kubectl to edit the PersistentVolume from 8Gi to 30Gi.
However, when I exec into the pod and run df -h, I still see the old size.
I have deleted the pods, but it shows the same thing again. If I cd into /dev, I don't see the disk or vda1 there either. What I actually want is for the bitnami/influxdb volume to be 30Gi. Please guide me and let me know if more info is needed.
This is a community wiki answer posted for better visibility. Feel free to expand it.
Based on the comments provided here, there could be several reasons for this behavior.
According to the Kubernetes documentation, manually changing the PersistentVolume size will not resize the underlying volume:
Warning: Directly editing the size of a PersistentVolume can prevent an automatic resize of that volume. If you edit the capacity of a PersistentVolume, and then edit the .spec of a matching PersistentVolumeClaim to make the size of the PersistentVolumeClaim match the PersistentVolume, then no storage resize happens. The Kubernetes control plane will see that the desired state of both resources matches, conclude that the backing volume size has been manually increased and that no resize is necessary.
It also depends on how Kubernetes is running and whether the allowVolumeExpansion feature is supported. From DigitalOcean:
Are you running one of DigitalOcean's managed clusters, or a DIY cluster running on DigitalOcean infrastructure? In case of the latter, which version of our CSI driver do you use? (You need v1.2.0 or later for volume expansion to be supported.)
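
For context, the supported path is to expand the PersistentVolumeClaim rather than the PersistentVolume, and it only works if the claim's StorageClass allows expansion. A sketch with hypothetical names (the actual claim, class and provisioner depend on your chart and cluster):

# The StorageClass must allow expansion for a PVC resize to reach the backing volume.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: expandable-block          # hypothetical name
provisioner: example.com/csi      # placeholder; use your cluster's CSI driver
allowVolumeExpansion: true
---
# Then edit only the claim's requested size (e.g. kubectl edit pvc <claim-name>).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-influxdb-0           # hypothetical claim name
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: expandable-block
  resources:
    requests:
      storage: 30Gi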

How to dynamically manage prometheus file_sd_configs in docker container?

I have been using a targets.json file, maintained by a Node.js application running locally, to dynamically add IP addresses for Prometheus to probe, using the file_sd_configs service discovery option. It has worked well: I was able to add new IPs, call the Prometheus reload API from the Node app, monitor those IPs, and issue alerts (with Blackbox exporter and Alertmanager).
However, the application and Prometheus now run inside Docker on the same network. How can I make my Node application write to (or update) a file inside a folder in the Prometheus container?
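
For context, a minimal sketch of how such a file is typically wired into Prometheus; the job name and container path are assumptions, not from the question:

scrape_configs:
  - job_name: blackbox-probes                        # hypothetical job name
    file_sd_configs:
      - files:
          - /etc/prometheus/targets/targets.json     # path inside the Prometheus container (assumed)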
You could bind-mount the targets.json file into both the Prometheus and the application container by adding a volume mapping to your docker-compose file.
volumes:
  - /hostpath/targets.json:/containerpath/targets.json
Instead of using a mapped host directory, you can also use named volumes; see the Docker documentation on volumes for more information.
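
A sketch of the named-volume variant, with assumed service names, image names and container paths (adjust to your setup):

services:
  prometheus:
    image: prom/prometheus
    volumes:
      - sd-targets:/etc/prometheus/targets      # Prometheus reads targets.json from here
  app:
    image: my-node-app                          # hypothetical image for the Node.js application
    volumes:
      - sd-targets:/data/targets                # the app writes/updates targets.json here
volumes:
  sd-targets:

Both containers see the same files, so the Node application can rewrite targets.json and then call the Prometheus reload API as before.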

Share volumes between docker stacks?

I have two different Docker stacks, one for HBase and one for Spark. I need to get the HBase jars into the Spark path. One way I can do this, without having to modify the Spark containers, is to use a volume. In my docker-compose.yml for HBase, I have defined a volume that points to the HBase home (it happens to be /opt/hbase-1.2.6). Is it possible to share that volume with the Spark stack?
Right now, since the two docker-compose files have different project names, the volume names get prefixed differently (hbase_hbasehome and spark_hbasehome), causing the share to fail.
You could use an external volume. From the official documentation:
If set to true, specifies that this volume has been created outside of Compose. docker-compose up does not attempt to create it, and raises an error if it doesn't exist.
external cannot be used in conjunction with other volume configuration keys (driver, driver_opts).
In the example below, instead of attempting to create a volume called [projectname]_data, Compose looks for an existing volume simply called data and mounts it into the db service's containers.
As an example:
version: '2'
services:
  db:
    image: postgres
    volumes:
      - data:/var/lib/postgresql/data
volumes:
  data:
    external: true
You can also specify the name of the volume separately from the name used to refer to it within the Compose file:
volumes:
  data:
    external:
      name: actual-name-of-volume
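
Applied to the two stacks from the question, it could look roughly like this (service names and the read-only flag are assumptions); the volume is created once outside Compose, e.g. with docker volume create hbasehome, and both stacks then reference it as external:

# hbase stack
services:
  hbase:
    volumes:
      - hbasehome:/opt/hbase-1.2.6
volumes:
  hbasehome:
    external: true

# spark stack
services:
  spark:
    volumes:
      - hbasehome:/opt/hbase-1.2.6:ro           # read-only on the consumer side (optional)
volumes:
  hbasehome:
    external: true

Because the volume is external, neither stack prefixes it with its project name, so both resolve to the same hbasehome volume.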
