Sharing volumes inside docker container on gitlab runner

So, I am trying to mount a working directory with project files into a child container on a GitLab runner, in a sort of DinD setup. I want to be able to mount a volume in a Docker instance, which would let me muck around and test stuff (e2e testing and such) without compiling a new container to inject the files I need. Ideally, I could share data in a DinD environment without having to build a new container for each job that runs.
I tried following (Docker volumes not mounted when using docker:dind (#41227) · Issues · GitLab.org / GitLab FOSS · GitLab) and I have some directories being mounted, but it is not the project data I am looking for.
So, in the test jobs, I created a dummy file, and I wish to mount the directory in a container and view the files.
I have a test CI yml which sort of does what I am looking for. I make test files in the volume I wish to mount, and I would like to see them in a directory listing, but sadly I do not. In my second attempt at this, I couldn't get the container ID because the labels don't exist on the runner, so it always comes up blank. However, the first stages show promise, as everything works perfectly on a "shell" runner outside of k8s. But as soon as I change the tag to use a k8s runner, it craps out: I can see old directory files (/web) and the directory I am mounting, but not the files within it. Weird?
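As a quick debugging sketch (my addition, hedged): listing the labels the runner actually puts on its containers shows whether a build container even exists for the label-based lookup used in the second attempt below. If this prints nothing, the job is not running as a Docker container the runner can label (run it wherever the Docker daemon the runner talks to lives):

docker ps --filter "label=com.gitlab.gitlab-runner.type=build" --format '{{.ID}} {{.Image}} {{.Labels}}'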
ci.yml
image: docker:stable

services:
  - docker:dind

stages:
  - compile

variables:
  SHARED_PATH: /builds/$CI_PROJECT_PATH/shared/
  DOCKER_DRIVER: overlay2

.test: &test
  stage: compile
  tags:
    - k8s-vols
  script:
    - docker version
    - 'export TESTED_IMAGE=$(echo ${CI_JOB_NAME} | sed "s/test //")'
    - docker pull ${TESTED_IMAGE}
    - 'export SHARED_PATH="$(dirname ${CI_PROJECT_DIR})/shared"'
    - echo ${SHARED_PATH}
    - echo ${CI_PROJECT_DIR}
    - mkdir -p ${SHARED_PATH}
    - touch ${SHARED_PATH}/test_file
    - touch ${CI_PROJECT_DIR}/test_file2
    - find ${SHARED_PATH}
    #- find ${CI_PROJECT_DIR}
    - docker run --rm -v ${CI_PROJECT_DIR}:/mnt ${TESTED_IMAGE} find /mnt
    - docker run --rm -v ${CI_PROJECT_DIR}:/mnt ${TESTED_IMAGE} ls -lR /mnt
    - docker run --rm -v ${SHARED_PATH}:/mnt ${TESTED_IMAGE} find /mnt
    - docker run --rm -v ${SHARED_PATH}:/mnt ${TESTED_IMAGE} ls -lR /mnt

test alpine: *test
test ubuntu: *test
test centos: *test

testing:
  stage: compile
  tags:
    - k8s-vols
  image:
    name: docker:stable
    entrypoint: ["/bin/sh", "-c"]
  script:
    # get id of the build container
    - export CONTAINER_ID=$(docker ps -q -f "label=com.gitlab.gitlab-runner.job.id=$CI_JOB_ID" -f "label=com.gitlab.gitlab-runner.type=build")
    # get mount name
    - export MOUNT_NAME=$(docker inspect $CONTAINER_ID -f "{{ range .Mounts }}{{ if eq .Destination \"/builds/${CI_PROJECT_NAMESPACE}\" }}{{ .Source }}{{end}}{{end}}" | cut -d "/" -f 6)
    # run container with the build volume mounted
    - docker run -v $MOUNT_NAME:/builds -w /builds/$CI_PROJECT_NAME --entrypoint=/bin/sh busybox -c "ls -la"
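For reference, the inspect template above walks .Mounts and prints the .Source of the mount whose .Destination matches the builds path. A standalone sketch you can run against any container to see what the template yields (my addition; <container-id> is a placeholder):

docker inspect <container-id> -f '{{ range .Mounts }}{{ .Destination }} -> {{ .Source }}{{ "\n" }}{{ end }}'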
This is the values file I am working with…
image: docker-registry.corp.com/base-images/gitlab-runner:alpine-v13.3.1
imagePullPolicy: IfNotPresent
gitlabUrl: http://gitlab.corp.com
runnerRegistrationToken: "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
runnerToken: ""
unregisterRunners: true
terminationGracePeriodSeconds: 3600
concurrent: 5
checkInterval: 10
rbac:
  create: true
  resources: ["pods", "pods/exec", "secrets"]
  verbs: ["get", "list", "watch", "update", "create", "delete"]
  clusterWideAccess: false
metrics:
  enabled: true
runners:
  image: docker-registry.corp.com/base-images/docker-dind:v1
  imagePullPolicy: "if-not-present"
  requestConcurrency: 5
  locked: true
  tags: "k8s-vols"
  privileged: true
  secret: gitlab-runner-vols
  namespace: gitlab-runner-k8s-vols
  pollTimeout: 180
  outputLimit: 4096
  kubernetes:
    volumes:
      - type: host_path
        volume:
          name: docker
          host_path: /var/run/docker.sock
          mount_path: /var/run/docker.sock
          read_only: false
  cache: {}
  builds: {}
  services: {}
  helpers:
    cpuLimit: 200m
    memoryLimit: 256Mi
    cpuRequests: 100m
    memoryRequests: 128Mi
    image: docker-registry.corp.com/base-images/gitlab-runner-helper:x86_64-latest
  env:
    NAME: VALUE
    CI_SERVER_URL: http://gitlab.corp.com
    CLONE_URL:
    RUNNER_REQUEST_CONCURRENCY: '1'
    RUNNER_EXECUTOR: kubernetes
    REGISTER_LOCKED: 'true'
    RUNNER_TAG_LIST: k8s-vols
    RUNNER_OUTPUT_LIMIT: '4096'
    KUBERNETES_IMAGE: ubuntu:18.04
    KUBERNETES_PRIVILEGED: 'true'
    KUBERNETES_NAMESPACE: gitlab-runners-k8s-vols
    KUBERNETES_POLL_TIMEOUT: '180'
    KUBERNETES_CPU_LIMIT:
    KUBERNETES_MEMORY_LIMIT:
    KUBERNETES_CPU_REQUEST:
    KUBERNETES_MEMORY_REQUEST:
    KUBERNETES_SERVICE_ACCOUNT:
    KUBERNETES_SERVICE_CPU_LIMIT:
    KUBERNETES_SERVICE_MEMORY_LIMIT:
    KUBERNETES_SERVICE_CPU_REQUEST:
    KUBERNETES_SERVICE_MEMORY_REQUEST:
    KUBERNETES_HELPER_CPU_LIMIT:
    KUBERNETES_HELPER_MEMORY_LIMIT:
    KUBERNETES_HELPER_CPU_REQUEST:
    KUBERNETES_HELPER_MEMORY_REQUEST:
    KUBERNETES_HELPER_IMAGE:
    KUBERNETES_PULL_POLICY:
securityContext:
  fsGroup: 65533
  runAsUser: 100
resources: {}
affinity: {}
nodeSelector: {}
tolerations: []
envVars:
  - name: CI_SERVER_URL
    value: http://gitlab.corp.com
  - name: CLONE_URL
  - name: RUNNER_REQUEST_CONCURRENCY
    value: '1'
  - name: RUNNER_EXECUTOR
    value: kubernetes
  - name: REGISTER_LOCKED
    value: 'true'
  - name: RUNNER_TAG_LIST
    value: k8s-vols
  - name: RUNNER_OUTPUT_LIMIT
    value: '4096'
  - name: KUBERNETES_IMAGE
    value: ubuntu:18.04
  - name: KUBERNETES_PRIVILEGED
    value: 'true'
  - name: KUBERNETES_NAMESPACE
    value: gitlab-runner-k8s-vols
  - name: KUBERNETES_POLL_TIMEOUT
    value: '180'
  - name: KUBERNETES_CPU_LIMIT
  - name: KUBERNETES_MEMORY_LIMIT
  - name: KUBERNETES_CPU_REQUEST
  - name: KUBERNETES_MEMORY_REQUEST
  - name: KUBERNETES_SERVICE_ACCOUNT
  - name: KUBERNETES_SERVICE_CPU_LIMIT
  - name: KUBERNETES_SERVICE_MEMORY_LIMIT
  - name: KUBERNETES_SERVICE_CPU_REQUEST
  - name: KUBERNETES_SERVICE_MEMORY_REQUEST
  - name: KUBERNETES_HELPER_CPU_LIMIT
  - name: KUBERNETES_HELPER_MEMORY_LIMIT
  - name: KUBERNETES_HELPER_CPU_REQUEST
  - name: KUBERNETES_HELPER_MEMORY_REQUEST
  - name: KUBERNETES_HELPER_IMAGE
  - name: KUBERNETES_PULL_POLICY
hostAliases:
  - ip: "10.10.x.x"
    hostnames:
      - "ch01"
podAnnotations:
  prometheus.io/path: "/metrics"
  prometheus.io/scrape: "true"
  prometheus.io/port: "9252"
podLabels: {}
So, I have made a couple of tweaks to the helm chart. I have added a volumes section in the config map…
config.toml: |
  concurrent = {{ .Values.concurrent }}
  check_interval = {{ .Values.checkInterval }}
  log_level = {{ default "info" .Values.logLevel | quote }}
  {{- if .Values.metrics.enabled }}
  listen_address = '[::]:9252'
  {{- end }}
  volumes = ["/builds:/builds"]
  #volumes = ["/var/run/docker.sock:/var/run/docker.sock", "/cache", "/builds:/builds"]
I tried using the commented-out line, which includes the docker.sock mount, but when it ran it complained that it could not find the docker.sock mount (file not found). So I kept only the builds directory in this section and added the docker.sock mount in the values file instead, and that seems to work fine for everything except this mounting issue…
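For comparison (my addition, hedged, based on the runner docs): with the Kubernetes executor, host-path mounts are usually declared as [[runners.kubernetes.volumes.host_path]] blocks in config.toml rather than via the docker-style volumes key, mirroring what the values file above does:

[[runners.kubernetes.volumes.host_path]]
  name = "docker"
  mount_path = "/var/run/docker.sock"
  host_path = "/var/run/docker.sock"
  read_only = false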
I also saw examples of setting the runner to privileged, but that didn’t seem to do much for me…
When I run the pipeline, this is the output…
So, as you can see, no files…

Related

crunchy postgres operator backup to azure blob fails

I want to back up the database into Azure Blob Storage, but it failed.
My configuration is as follows:
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: hippo-azure
spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.6-2
  postgresVersion: 14
  instances:
    - dataVolumeClaimSpec:
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: 32Gi
  backups:
    pgbackrest:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.41-2
      configuration:
        - secret:
            name: pgo-azure-creds
      global:
        repo2-path: /pgbackrest/repo2
      repos:
        - name: repo1
          volume:
            volumeClaimSpec:
              accessModes:
                - "ReadWriteOnce"
              resources:
                requests:
                  storage: 32Gi
        - name: repo2
          azure:
            container: "pgo"
  patroni:
    dynamicConfiguration:
      postgresql:
        pg_hba:
          - "host all all 0.0.0.0/0 trust"
          - "host all postgres 127.0.0.1/32 md5"
  users:
    - name: qixin
      databases:
        - iot
        - lowcode
      options: "SUPERUSER"
  service:
    type: LoadBalancer
My storage account name is pgobackup, container name is pgo.
The content of azure.conf is as follows.
[global]
repo2-azure-account=pgobackup
repo2-azure-key=aXdafScEP28el......JpkYa28nh5V+AStNtZ5Lg==
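(For completeness, a hedged sketch of how a conf file like this is typically wired to the configuration secret named above, assuming it is saved locally as azure.conf:)

kubectl create secret generic pgo-azure-creds \
  --from-file=azure.conf=./azure.conf \
  -n postgres-operator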
But there are no files in the blob container.
I also executed the following command for a one-time backup:
kubectl annotate -n postgres-operator postgrescluster hippo-azure postgres-operator.crunchydata.com/pgbackrest-backup="$( date '+%F_%H:%M:%S' )" --overwrite=true
I also tried scheduled backups, but that also failed
- name: repo2
  schedules:
    full: "18 * * * *"
    differential: "0 1 * * 1-6"
  azure:
    container: "pgo"
This is taken from the official documentation, which says a CronJob will be created, but no Job or CronJob is created.
Can anyone give me some advice?
Thx!

Files in AIRFLOW_HOME (which is an Azure File Share MOUNT) are created as root

I have set up Airflow in Azure Cloud (Azure Container Apps) and attached an Azure File Share as an external mount/volume.
1. I ran the **airflow init** service; it created the airflow.cfg and `webserver_config.py` files in **AIRFLOW_HOME (/opt/airflow)**, which is actually the Azure-mounted file system.
2. I ran the **airflow webserver** service; it created the `airflow-webserver.pid` file in **AIRFLOW_HOME (/opt/airflow)**, which is actually the Azure-mounted file system.
Now the problem is that all the files created above are owned by the root user and group, not by the airflow user (50000), even though I set the env variable AIRFLOW_UID to 50000 during the creation of the container app. Because of this, my webservers are not starting, throwing the error below:
PermissionError: [Errno 1] Operation not permitted: '/opt/airflow/airflow-webserver.pid'
Note: Azure Container Apps does not allow root/sudo commands; otherwise I could solve this problem with simple chown commands.
Another problem is that the Airflow configuration passed through environment variables is never picked up by Docker, e.g.:
- name: AIRFLOW__API__AUTH_BACKENDS
  value: 'airflow.api.auth.backend.basic_auth'
Attached screenshot for reference
Your help is much appreciated!
YAML file that I use to create my container app:
id: /subscriptions/1234/resourceGroups/<my-res-group>/providers/Microsoft.App/containerApps/<app-name>
identity:
  type: None
location: eastus2
name: webservice
properties:
  configuration:
    activeRevisionsMode: Single
    registries: []
  managedEnvironmentId: /subscriptions/1234/resourceGroups/<my-res-group>/providers/Microsoft.App/managedEnvironments/container-app-env
  template:
    containers:
      - command:
          - /bin/bash
          - -c
          - exec /entrypoint airflow webserver
        env:
          - name: AIRFLOW__API__AUTH_BACKENDS
            value: 'airflow.api.auth.backend.basic_auth'
          - name: AIRFLOW__CELERY__BROKER_URL
            value: redis://:@myredis.redis.cache.windows.net:6379/0
          - name: AIRFLOW__CELERY__RESULT_BACKEND
            value: db+postgresql://user:pass@postres-db-servconn.postgres.database.azure.com/airflow?sslmode=require
          - name: AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION
            value: 'true'
          - name: AIRFLOW__CORE__EXECUTOR
            value: CeleryExecutor
          - name: AIRFLOW__CORE__SQL_ALCHEMY_CONN
            value: postgresql+psycopg2://user:pass@postres-db-servconn.postgres.database.azure.com/airflow?sslmode=require
          - name: AIRFLOW__DATABASE__SQL_ALCHEMY_CONN
            value: postgresql+psycopg2://user:pass@postres-db-servconn.postgres.database.azure.com/airflow?sslmode=require
          - name: AIRFLOW__CORE__LOAD_EXAMPLES
            value: 'false'
          - name: AIRFLOW_UID
            value: 50000
        image: docker.io/apache/airflow:latest
        name: wsr
        volumeMounts:
          - volumeName: randaf-azure-files-volume
            mountPath: /opt/airflow
        probes: []
        resources:
          cpu: 0.25
          memory: 0.5Gi
    scale:
      maxReplicas: 3
      minReplicas: 1
    volumes:
      - name: randaf-azure-files-volume
        storageName: randafstorage
        storageType: AzureFile
resourceGroup: RAND
tags:
  tagname: ws-only
type: Microsoft.App/containerApps

gitlab-ci docker-in-docker with insecure registry

I am currently trying to deploy an image built on GitLab CI/CD to the registry.
The runner and GitLab containers are set up on the gitlab network.
So far, I couldn't make it work.
Here are my configs:
# gitlab-runner.toml
concurrent = 1
check_interval = 0

[[runners]]
  name = "runner1"
  url = "http://gitlab"
  token = "t4ihZ8Tc4Kxy5i5EgHYt"
  executor = "docker"
  [runners.docker]
    host = ""
    tls_verify = false
    image = "ruby:2.1"
    privileged = false
    disable_cache = false
    shm_size = 0
    network_mode = "gitlab"
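Note (my addition, hedged): the docker:dind service generally needs the Docker executor to run containers in privileged mode; without it, dind fails with exactly the "mount: permission denied (are you root?)" lines that appear in the deployment logs further down. A minimal sketch of the relevant key, everything else unchanged:

[runners.docker]
  privileged = true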
# gitlab.rb
external_url 'https://gitlab.domain.com/'
gitlab_rails['initial_root_password'] = File.read('/run/secrets/gitlab_root_password')
nginx['listen_https'] = false
nginx['listen_port'] = 80
nginx['redirect_http_to_https'] = false
letsencrypt['enable'] = false
gitlab_rails['smtp_enable'] = true
gitlab_rails['smtp_address'] = "smtp.domain.com"
gitlab_rails['smtp_port'] = 587
gitlab_rails['smtp_user_name'] = "gitlab"
gitlab_rails['smtp_password'] = "password"
gitlab_rails['smtp_domain'] = "domain.com"
gitlab_rails['smtp_authentication'] = "login"
gitlab_rails['smtp_enable_starttls_auto'] = true
gitlab_rails['smtp_openssl_verify_mode'] = 'peer'
gitlab_rails['gitlab_email_from'] = 'gitlab@domain.com'
gitlab_rails['gitlab_email_reply_to'] = 'noreply@domain.com'
# gitlab-compose.yml
version: "3.6"
services:
  gitlab:
    image: gitlab/gitlab-ce:latest
    volumes:
      - gitlab_data:/var/opt/gitlab
      - gitlab_logs:/var/log/gitlab
      - gitlab_config:/etc/gitlab
    shm_size: '256m'
    environment:
      GITLAB_OMNIBUS_CONFIG: "from_file('/omnibus_config.rb')"
      GITLAB_ROOT_EMAIL: "contact@domain.com"
      GITLAB_ROOT_PASSWORD: "password"
    configs:
      - source: gitlab
        target: /omnibus_config.rb
    secrets:
      - gitlab_root_password
    deploy:
      placement:
        constraints:
          - node.labels.role == compute
      labels:
        - "traefik.enable=true"
        - "traefik.docker.network=traefik-public"
        - "traefik.constraint-label=traefik-public"
        - "traefik.http.services.gitlab.loadbalancer.server.port=80"
        - "traefik.http.routers.gitlab.rule=Host(`gitlab.domain.com`)"
        - "traefik.http.routers.gitlab.entrypoints=websecure"
        - "traefik.http.routers.gitlab.tls.certresolver=lets-encrypt"
    networks:
      - gitlab
      - traefik-public
configs:
  gitlab:
    file: ./gitlab.rb
secrets:
  gitlab_root_password:
    file: ./root_password.txt
volumes:
  gitlab_data:
    driver: local
  gitlab_logs:
    driver: local
  gitlab_config:
    driver: local
networks:
  gitlab:
    external: true
  traefik-public:
    external: true
# .gitlab-ci.yml
stages:
  - gulp_build
  - docker_build_deploy

cache:
  paths:
    - node_modules/

variables:
  DEPLOY_USER: $DEPLOY_USER
  DEPLOY_TOKEN: $DEPLOY_TOKEN

build app:
  stage: gulp_build
  image: node:14.17
  before_script:
    - npm install
  script:
    - ./node_modules/.bin/gulp build -production
  artifacts:
    paths:
      - public

docker deploy:
  stage: docker_build_deploy
  image: docker:latest
  services:
    - name: docker:dind
      command: ["--insecure-registry=gitlab"]
  before_script:
    - docker login -u $DEPLOY_USER -p $DEPLOY_TOKEN gitlab
  script:
    - echo $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG
    - docker build -t gitlab/laurene/domain.com:$CI_COMMIT_REF_SLUG -t gitlab/laurene/domain.com:latest .
    - docker push gitlab/laurene/domain.com:$CI_COMMIT_REF_SLUG
    - docker push gitlab/laurene/domain.com:latest
Deployment logs:
Running with gitlab-runner 14.9.1 (bd40e3da)
  on runner1 t4ihZ8Tc
Preparing the "docker" executor
Using Docker executor with image docker:latest ...
Starting service docker:dind ...
Pulling docker image docker:dind ...
Using docker image sha256:a072474332af3e4cf06e349685c4cea8f9e631f0c5cab5b582f3a3ab4cff9b6a for docker:dind with digest docker@sha256:210076c7772f47831afaf7ff200cf431c6cd191f0d0cb0805b1d9a996e99fb5e ...
Waiting for services to be up and running...
*** WARNING: Service runner-t4ihz8tc-project-2-concurrent-0-2cd68d823b0d9914-docker-0 probably didn't start properly.
Health check error:
service "runner-t4ihz8tc-project-2-concurrent-0-2cd68d823b0d9914-docker-0-wait-for-service" timeout
Health check container logs:
Service container logs:
2022-04-18T00:34:50.436194142Z Generating RSA private key, 4096 bit long modulus (2 primes)
2022-04-18T00:34:50.490663718Z ............++++
2022-04-18T00:34:50.549517108Z ...............++++
2022-04-18T00:34:50.549802329Z e is 65537 (0x010001)
2022-04-18T00:34:50.562099799Z Generating RSA private key, 4096 bit long modulus (2 primes)
2022-04-18T00:34:50.965975282Z .....................................................................................................................++++
2022-04-18T00:34:51.033998142Z ..................++++
2022-04-18T00:34:51.034281623Z e is 65537 (0x010001)
2022-04-18T00:34:51.056355164Z Signature ok
2022-04-18T00:34:51.056369034Z subject=CN = docker:dind server
2022-04-18T00:34:51.056460584Z Getting CA Private Key
2022-04-18T00:34:51.065394153Z /certs/server/cert.pem: OK
2022-04-18T00:34:51.067347859Z Generating RSA private key, 4096 bit long modulus (2 primes)
2022-04-18T00:34:51.210090561Z ........................................++++
2022-04-18T00:34:51.491331619Z .................................................................................++++
2022-04-18T00:34:51.491620790Z e is 65537 (0x010001)
2022-04-18T00:34:51.509644008Z Signature ok
2022-04-18T00:34:51.509666918Z subject=CN = docker:dind client
2022-04-18T00:34:51.509757628Z Getting CA Private Key
2022-04-18T00:34:51.519103998Z /certs/client/cert.pem: OK
2022-04-18T00:34:51.594873133Z ip: can't find device 'ip_tables'
2022-04-18T00:34:51.595519686Z ip_tables 32768 3 iptable_mangle,iptable_filter,iptable_nat
2022-04-18T00:34:51.595526296Z x_tables 40960 14 xt_REDIRECT,xt_ipvs,xt_state,xt_policy,iptable_mangle,xt_mark,xt_u32,xt_nat,xt_tcpudp,xt_conntrack,xt_MASQUERADE,xt_addrtype,iptable_filter,ip_tables
2022-04-18T00:34:51.595866717Z modprobe: can't change directory to '/lib/modules': No such file or directory
2022-04-18T00:34:51.597027030Z mount: permission denied (are you root?)
2022-04-18T00:34:51.597064490Z Could not mount /sys/kernel/security.
2022-04-18T00:34:51.597067880Z AppArmor detection and --privileged mode might break.
2022-04-18T00:34:51.597608422Z mount: permission denied (are you root?)
*********
Pulling docker image docker:latest ...
Using docker image sha256:7417809fdb730b60c1b903077030aacc708677cdf02f2416ce413f38e81ec7e0 for docker:latest with digest docker@sha256:41978d1974f05f80e1aef23ac03040491a7e28bd4551d4b469b43e558341864e ...
Preparing environment
Running on runner-t4ihz8tc-project-2-concurrent-0 via fed5cebcc8e6...
Getting source from Git repository
Fetching changes with git depth set to 20...
Reinitialized existing Git repository in /builds/laurene/oelabs.co/.git/
Checking out a63a1f2a as master...
Removing node_modules/
Skipping Git submodules setup
Restoring cache
Checking cache for default...
No URL provided, cache will not be downloaded from shared cache server. Instead a local version of cache will be extracted.
Successfully extracted cache
Downloading artifacts
Downloading artifacts for build app (97)...
Downloading artifacts from coordinator... ok  id=97 responseStatus=200 OK token=Uvp--J3i
Executing "step_script" stage of the job script
Using docker image sha256:7417809fdb730b60c1b903077030aacc708677cdf02f2416ce413f38e81ec7e0 for docker:latest with digest docker@sha256:41978d1974f05f80e1aef23ac03040491a7e28bd4551d4b469b43e558341864e ...
$ docker login -u $DEPLOY_USER -p $DEPLOY_TOKEN gitlab
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
time="2022-04-18T00:35:33Z" level=info msg="Error logging in to endpoint, trying next endpoint" error="Get \"https://gitlab/v2/\": dial tcp 10.0.18.4:443: connect: connection refused"
Get "https://gitlab/v2/": dial tcp 10.0.18.4:443: connect: connection refused
ERROR: Job failed: exit code 1

Elastic Search upgrade to v8 on Kubernetes

I have an Elasticsearch deployment on a Microsoft Kubernetes cluster that was deployed with a 7.x chart, and I changed the image to 8.x. The upgrade worked, and both Elasticsearch and Kibana were accessible, but now I need to enable the new security feature, which is included in the basic license from now on. The requirement for security first came from the need to enable APM Server/Agents.
I have the following values:
- name: cluster.initial_master_nodes
  value: elasticsearch-master-0,
- name: discovery.seed_hosts
  value: elasticsearch-master-headless
- name: cluster.name
  value: elasticsearch
- name: network.host
  value: 0.0.0.0
- name: cluster.deprecation_indexing.enabled
  value: 'false'
- name: node.roles
  value: data,ingest,master,ml,remote_cluster_client
The Elasticsearch and Kibana pods are able to start, but I am unable to set up the APM integration due to security. So I am enabling security using the values below:
- name: xpack.security.enabled
  value: 'true'
Then I get an error log from the Elasticsearch pod: "Transport SSL must be enabled if security is enabled. Please set [xpack.security.transport.ssl.enabled] to [true] or disable security by setting [xpack.security.enabled] to [false]". So I enable SSL using the values below:
- name: xpack.security.transport.ssl.enabled
  value: 'true'
Then I get an error log from the Elasticsearch pod: "invalid SSL configuration for xpack.security.transport.ssl - server ssl configuration requires a key and certificate, but these have not been configured; you must set either [xpack.security.transport.ssl.keystore.path] (p12 file), or both [xpack.security.transport.ssl.key] (pem file) and [xpack.security.transport.ssl.certificate] (pem key file)".
I started with Option 1: I created the keys using the commands below (no password: enter, enter / enter, enter, enter) and copied them to a persistent folder:
./bin/elasticsearch-certutil ca
./bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12
cp elastic-stack-ca.p12 data/elastic-stack-ca.p12
cp elastic-certificates.p12 data/elastic-certificates.p12
In addition, I also configured the values below:
- name: xpack.security.transport.ssl.truststore.path
  value: '/usr/share/elasticsearch/data/elastic-certificates.p12'
- name: xpack.security.transport.ssl.keystore.path
  value: '/usr/share/elasticsearch/data/elastic-certificates.p12'
But the pod is still initializing. If I generate the certificates with a password, I instead get an error log from the Elasticsearch pod: "cannot read configured [PKCS12] keystore (as a truststore) [/usr/share/elasticsearch/data/elastic-certificates.p12] - this is usually caused by an incorrect password; (no password was provided)".
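(A hedged side note: if the PKCS12 files are generated with a password, Elasticsearch expects those passwords in its secure keystore rather than in plain settings; a minimal sketch using the stock tool:)

./bin/elasticsearch-keystore add xpack.security.transport.ssl.keystore.secure_password
./bin/elasticsearch-keystore add xpack.security.transport.ssl.truststore.secure_password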
Then I went to Option 2: I created the keys using the commands below and copied them to a persistent folder:
./bin/elasticsearch-certutil ca --pem
unzip elastic-stack-ca.zip -d
cp ca.crt data/ca.crt
cp ca.key data/ca.key
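(Hedged side note: ca.crt/ca.key are only the CA pair; a node certificate signed by that CA is normally generated with a second certutil run, something like the sketch below:)

./bin/elasticsearch-certutil cert --ca-cert data/ca.crt --ca-key data/ca.key --pem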
In addition, I also configured the values below:
- name: xpack.security.transport.ssl.key
  value: '/usr/share/elasticsearch/data/ca.key'
- name: xpack.security.transport.ssl.certificate
  value: '/usr/share/elasticsearch/data/ca.crt'
But the pod is still in the initializing state without providing any logs; as far as I know, while a pod is in the initializing state it does not produce any container logs. On the portal side, everything in the events looks OK, except the elastic pod, which is not in a ready state.
Finally, I found the same issue raised in the Elasticsearch community, without any response: https://discuss.elastic.co/t/elasticsearch-pods-are-not-ready-when-xpack-security-enabled-is-configured/281709?u=s19k15
Here is my StatefulSet:
status:
  observedGeneration: 169
  replicas: 1
  updatedReplicas: 1
  currentRevision: elasticsearch-master-7449d7bd69
  updateRevision: elasticsearch-master-7d8c7b6997
  collisionCount: 0
spec:
  replicas: 1
  selector:
    matchLabels:
      app: elasticsearch-master
  template:
    metadata:
      name: elasticsearch-master
      creationTimestamp: null
      labels:
        app: elasticsearch-master
        chart: elasticsearch
        release: platform
    spec:
      initContainers:
        - name: configure-sysctl
          image: docker.elastic.co/elasticsearch/elasticsearch:8.1.2
          command:
            - sysctl
            - '-w'
            - vm.max_map_count=262144
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
          securityContext:
            privileged: true
            runAsUser: 0
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch:8.1.2
          ports:
            - name: http
              containerPort: 9200
              protocol: TCP
            - name: transport
              containerPort: 9300
              protocol: TCP
          env:
            - name: node.name
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: cluster.initial_master_nodes
              value: elasticsearch-master-0,
            - name: discovery.seed_hosts
              value: elasticsearch-master-headless
            - name: cluster.name
              value: elasticsearch
            - name: cluster.deprecation_indexing.enabled
              value: 'false'
            - name: ES_JAVA_OPTS
              value: '-Xmx512m -Xms512m'
            - name: node.roles
              value: data,ingest,master,ml,remote_cluster_client
            - name: xpack.license.self_generated.type
              value: basic
            - name: xpack.security.enabled
              value: 'true'
            - name: xpack.security.transport.ssl.enabled
              value: 'true'
            - name: xpack.security.transport.ssl.truststore.path
              value: /usr/share/elasticsearch/data/elastic-certificates.p12
            - name: xpack.security.transport.ssl.keystore.path
              value: /usr/share/elasticsearch/data/elastic-certificates.p12
            - name: xpack.security.http.ssl.enabled
              value: 'true'
            - name: xpack.security.http.ssl.truststore.path
              value: /usr/share/elasticsearch/data/elastic-certificates.p12
            - name: xpack.security.http.ssl.keystore.path
              value: /usr/share/elasticsearch/data/elastic-certificates.p12
            - name: logger.org.elasticsearch.discovery
              value: debug
            - name: path.logs
              value: /usr/share/elasticsearch/data
            - name: xpack.security.enrollment.enabled
              value: 'true'
          resources:
            limits:
              cpu: '1'
              memory: 2Gi
            requests:
              cpu: 100m
              memory: 512Mi
          volumeMounts:
            - name: elasticsearch-master
              mountPath: /usr/share/elasticsearch/data
          readinessProbe:
            exec:
              command:
                - bash
                - '-c'
                - |
                  set -e
                  # If the node is starting up wait for the cluster to be ready (request params: "wait_for_status=green&timeout=1s")
                  # Once it has started only check that the node itself is responding
                  START_FILE=/tmp/.es_start_file
                  # Disable nss cache to avoid filling dentry cache when calling curl
                  # This is required with Elasticsearch Docker using nss < 3.52
                  export NSS_SDB_USE_CACHE=no
                  http () {
                    local path="${1}"
                    local args="${2}"
                    set -- -XGET -s
                    if [ "$args" != "" ]; then
                      set -- "$@" $args
                    fi
                    if [ -n "${ELASTIC_PASSWORD}" ]; then
                      set -- "$@" -u "elastic:${ELASTIC_PASSWORD}"
                    fi
                    curl --output /dev/null -k "$@" "http://127.0.0.1:9200${path}"
                  }
                  if [ -f "${START_FILE}" ]; then
                    echo 'Elasticsearch is already running, lets check the node is healthy'
                    HTTP_CODE=$(http "/" "-w %{http_code}")
                    RC=$?
                    if [[ ${RC} -ne 0 ]]; then
                      echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} http://127.0.0.1:9200/ failed with RC ${RC}"
                      exit ${RC}
                    fi
                    # ready if HTTP code 200, 503 is tolerable if ES version is 6.x
                    if [[ ${HTTP_CODE} == "200" ]]; then
                      exit 0
                    elif [[ ${HTTP_CODE} == "503" && "8" == "6" ]]; then
                      exit 0
                    else
                      echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} http://127.0.0.1:9200/ failed with HTTP code ${HTTP_CODE}"
                      exit 1
                    fi
                  else
                    echo 'Waiting for elasticsearch cluster to become ready (request params: "wait_for_status=green&timeout=1s")'
                    if http "/_cluster/health?wait_for_status=green&timeout=1s" "--fail"; then
                      touch ${START_FILE}
                      exit 0
                    else
                      echo 'Cluster is not yet ready (request params: "wait_for_status=green&timeout=1s")'
                      exit 1
                    fi
                  fi
            initialDelaySeconds: 10
            timeoutSeconds: 5
            periodSeconds: 10
            successThreshold: 3
            failureThreshold: 3
          lifecycle:
            postStart:
              exec:
                command:
                  - bash
                  - '-c'
                  - |
                    #!/bin/bash
                    # Create the dev.general.logcreation.elasticsearchlogobject.v1.json index
                    ES_URL=http://localhost:9200
                    while [[ "$(curl -s -o /dev/null -w '%{http_code}\n' $ES_URL)" != "200" ]]; do sleep 1; done
                    curl --request PUT --header 'Content-Type: application/json' "$ES_URL/dev.general.logcreation.elasticsearchlogobject.v1.json/" --data '{"mappings":{"properties":{"Properties":{"properties":{"StatusCode":{"type":"text"}}}}},"settings":{"index":{"number_of_shards":"1","number_of_replicas":"0"}}}'
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
          securityContext:
            capabilities:
              drop:
                - ALL
            runAsUser: 1000
            runAsNonRoot: true
      restartPolicy: Always
      terminationGracePeriodSeconds: 120
      dnsPolicy: ClusterFirst
      automountServiceAccountToken: true
      securityContext:
        runAsUser: 1000
        fsGroup: 1000
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - elasticsearch-master
              topologyKey: kubernetes.io/hostname
      schedulerName: default-scheduler
      enableServiceLinks: true
  volumeClaimTemplates:
    - kind: PersistentVolumeClaim
      apiVersion: v1
      metadata:
        name: elasticsearch-master
        creationTimestamp: null
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 4Gi
        volumeMode: Filesystem
      status:
        phase: Pending
  serviceName: elasticsearch-master-headless
  podManagementPolicy: Parallel
  updateStrategy:
    type: RollingUpdate
  revisionHistoryLimit: 10
Any ideas?
Finally found the answer; maybe it helps a lot of people in case they face something similar. When the pod is initializing endlessly, it is effectively sleeping. In my case, a piece of code inside my chart's StatefulSet started causing this issue when security became enabled:
while [[ "$(curl -s -o /dev/null -w '%{http_code}\n' $ES_URL)" != "200" ]]; do sleep 1; done
This will never return 200, because the HTTP endpoint now also expects a user and a password to authenticate, so the loop goes to sleep forever.
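(A hedged sketch of one way to keep such a loop working once security is enabled: pass basic auth so the root endpoint can return 200 again, assuming ELASTIC_PASSWORD is available in the container environment:)

while [[ "$(curl -s -o /dev/null -w '%{http_code}' -u "elastic:${ELASTIC_PASSWORD}" "$ES_URL")" != "200" ]]; do sleep 1; done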
So, if your pods are stuck in the initializing state and stay there, make sure there is no such while/sleep loop.

Create a file in GitHub action

Inside a GitHub Action, I'm using Anchore + grype to scan a container image, using the job below:
name: "CI"
on:
  push:
  pull_request:
    branches:
      - main
jobs:
  image-analysis:
    name: Analyze image
    runs-on: ubuntu-18.04
    needs: build
    steps:
      - name: Scan operator image
        uses: anchore/scan-action@v3
        id: scan
        with:
          image: "qserv/qserv-operator:2022.1.1-rc1"
          acs-report-enable: true
In order to ignore a false-positive during image scan, I want to create the file $HOME/.grype.yaml (see content below) before launching the image scan:
ignore:
  # False positive, see https://github.com/anchore/grype/issues/558
  - vulnerability: CVE-2015-5237
    fix-state: unknown
    package:
      name: google.golang.org/protobuf
      version: v1.26.0
      type: go-module
      location: "/manager"
Could you please show me how to create this file inside a GitHub Action?
You could do something as simple as creating the file and then writing to it like this:
- name: Create grype.yaml
  run: |
    touch ~/grype.yaml
    echo 'ignore:
      # False positive, see https://github.com/anchore/grype/issues/558
      - vulnerability: CVE-2015-5237
        fix-state: unknown
        package:
          name: google.golang.org/protobuf
          version: v1.26.0
          type: go-module
          location: "/manager"' > ~/grype.yaml
This one works and has been tested successfully on GitHub Actions:
name: "CI"
on:
  push:
  pull_request:
    branches:
      - main
jobs:
  image-analysis:
    name: Analyze image
    runs-on: ubuntu-18.04
    permissions:
      security-events: write
    needs: build
    steps:
      - name: Create grype configuration
        run: |
          cat <<EOF > $HOME/.grype.yaml
          ignore:
            # False positive, see https://github.com/anchore/grype/issues/558
            - vulnerability: CVE-2015-5237
              fix-state: unknown
              package:
                name: google.golang.org/protobuf
                version: v1.26.0
                type: go-module
                location: "/manager"
          EOF
      - name: Scan operator image
        uses: anchore/scan-action@v3
        id: scan
        with:
          image: "qserv/qserv-operator:2022.1.1-rc1"
          acs-report-enable: true
          fail-build: false
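(Worth noting, hedged: grype looks for a .grype.yaml in the current directory and in $HOME by default, which is why no extra flag needs to be passed to the scan step; verify against the grype docs for your version.)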
