How to run YugabyteDB with systemd?

I'm looking for some production config files to run YugabyteDB with systemd. They should set ulimits, start the processes at boot, and restart them on failure.

Sometimes systemd requires ulimits to be specified in its own config as well, which the config files below also include.
yb-master service:
# /etc/systemd/system/yugabyte-master.service
[Unit]
Wants=network-online.target
After=network-online.target
Description=yugabyte-master
[Service]
RestartForceExitStatus=SIGPIPE
EnvironmentFile=/etc/sysconfig/mycompany_env
StartLimitInterval=0
ExecStart=/bin/bash -c '/opt/misc/yugabyte/bin/yb-master \
--fs_data_dirs=/opt/data/1/yugabyte \
--rpc_bind_addresses=n1.node.gce-us-east1.mycompany:7100 \
--server_broadcast_addresses=n1.node.gce-us-east1.mycompany:7100 \
--webserver_interface=n1.node.gce-us-east1.mycompany \
--webserver_port=7000 \
--use_private_ip=never \
--placement_cloud=gce \
--placement_region=gce-us-east1 \
--placement_zone=us-east1-c \
--callhome_collection_level=low \
--logtostderr '
LimitCORE=infinity
TimeoutStartSec=30
WorkingDirectory=/opt/data/1/yugabyte
LimitNOFILE=1048576
LimitNPROC=12000
RestartSec=5
ExecStartPre=/usr/bin/su -c "mkdir -p /opt/data/1/yugabyte && chown yugabyte:yugabyte /opt/data/1/yugabyte"
PermissionsStartOnly=True
User=yugabyte
TimeoutStopSec=300
Restart=always
[Install]
WantedBy=multi-user.target
And one for the yb-tserver service:
# /etc/systemd/system/yugabyte-tserver.service
[Unit]
Wants=network-online.target
After=network-online.target
Description=yugabyte-tserver
[Service]
RestartForceExitStatus=SIGPIPE
EnvironmentFile=/etc/sysconfig/mycompany_env
StartLimitInterval=0
ExecStart=/bin/bash -c '/opt/misc/yugabyte/bin/yb-tserver \
--tserver_master_addrs=n1.node.gce-us-east1.mycompany:7100,n2.node.gce-us-central1.mycompany:7100,n3.node.gce-us-west1.mycompany:7100 \
--fs_data_dirs=/opt/data/1/yugabyte \
--rpc_bind_addresses=n1.node.gce-us-east1.mycompany:9200 \
--server_broadcast_addresses=n1.node.gce-us-east1.mycompany:9200 \
--webserver_interface=n1.node.gce-us-east1.mycompany \
--webserver_port=9000 \
--cql_proxy_bind_address=0.0.0.0:9042 \
--use_private_ip=never \
--placement_cloud=gce \
--placement_region=gce-us-east1 \
--placement_zone=us-east1-c \
--start_redis_proxy=false \
--use_cassandra_authentication=true \
--max_stale_read_bound_time_ms=60000 \
--logtostderr --placement_uuid=live'
LimitCORE=infinity
TimeoutStartSec=30
WorkingDirectory=/opt/data/1/yugabyte
LimitNOFILE=1048576
LimitNPROC=12000
RestartSec=5
ExecStartPre=/usr/bin/su -c "mkdir -p /opt/data/1/yugabyte && chown yugabyte:yugabyte /opt/data/1/yugabyte"
PermissionsStartOnly=True
User=yugabyte
TimeoutStopSec=300
Restart=always
[Install]
WantedBy=multi-user.target
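With both unit files in place, the usual systemd flow applies (a short sketch, assuming the file names above):
sudo systemctl daemon-reload
sudo systemctl enable --now yugabyte-master.service
sudo systemctl enable --now yugabyte-tserver.service
systemctl status yugabyte-master.service yugabyte-tserver.service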

Related

podman inside podman: works only with "privileged" while it works without for the official podman image

I am trying to create a podman image that allows me to run rootless podman inside rootless podman.
I have read https://www.redhat.com/sysadmin/podman-inside-container
and tried to build an image analogous to quay.io/podman/stable:latest based on top of docker.io/python:3.10-slim-bullseye or docker.io/ubuntu:22.04,
but somehow my images require --privileged which the quay.io/podman fedora-based image does not.
For reference, here is what works for quay.io/podman/stable:latest:
$ podman run --rm \
--security-opt label=disable \
--device /dev/fuse \
--user podman \
quay.io/podman/stable:latest podman info
prints the podman info with no warnings or errors; podman run hello-world also works inside the container as expected.
I have created a dockerfile for a debian/ubuntu-based image that allows running rootless podman inside. The dockerfile closely follows https://www.redhat.com/sysadmin/podman-inside-container and https://github.com/containers/podman/blob/main/contrib/podmanimage/stable/Containerfile
and is shown at the bottom.
However, the resulting image (call it podinpodtest) does not work as expected:
$ podman run --rm \
--security-opt label=disable \
--device /dev/fuse \
--user podman \
podinpodtest podman info
results in Error: cannot setup namespace using newuidmap: exit status 1.
Adding --privileged makes the image work:
$ podman run --rm \
--security-opt label=disable \
--device /dev/fuse \
--user podman \
--privileged \
podinpodtest podman info
correctly prints the podman info.
Why does the debian/ubuntu based image require --privileged for running rootless podman inside of it?
I do not want to run the image with --privileged – can the debian/ubuntu based image be fixed to work similarly to the quay.io/podman image?
#FROM docker.io/python:3.10-slim-bullseye
FROM docker.io/ubuntu:22.04
RUN apt-get update && apt-get install -y \
containers-storage \
fuse-overlayfs \
libvshadow-utils \
podman \
&& rm -rf /var/lib/apt/lists/*
RUN useradd podman; \
echo "podman:1:999\npodman:1001:64535" > /etc/subuid; \
echo "podman:1:999\npodman:1001:64535" > /etc/subgid;
ARG _REPO_URL="https://raw.githubusercontent.com/containers/podman/main/contrib/podmanimage/stable"
ADD $_REPO_URL/containers.conf /etc/containers/containers.conf
ADD $_REPO_URL/podman-containers.conf /home/podman/.config/containers/containers.conf
RUN mkdir -p /home/podman/.local/share/containers && \
chown podman:podman -R /home/podman && \
chmod 644 /etc/containers/containers.conf
# Copy & modify the defaults to provide reference if runtime changes needed.
# Changes here are required for running with fuse-overlay storage inside container.
RUN sed -e 's|^#mount_program|mount_program|g' \
-e '/additionalimage.*/a "/var/lib/shared",' \
-e 's|^mountopt[[:space:]]*=.*$|mountopt = "nodev,fsync=0"|g' \
/usr/share/containers/storage.conf \
> /etc/containers/storage.conf
# Note VOLUME options must always happen after the chown call above
# RUN commands can not modify existing volumes
VOLUME /var/lib/containers
VOLUME /home/podman/.local/share/containers
RUN mkdir -p /var/lib/shared/overlay-images \
/var/lib/shared/overlay-layers \
/var/lib/shared/vfs-images \
/var/lib/shared/vfs-layers && \
touch /var/lib/shared/overlay-images/images.lock && \
touch /var/lib/shared/overlay-layers/layers.lock && \
touch /var/lib/shared/vfs-images/images.lock && \
touch /var/lib/shared/vfs-layers/layers.lock
ENV _CONTAINERS_USERNS_CONFIGURED=""
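One thing worth checking (my assumption, not from the original post): the error points at newuidmap, which on Debian/Ubuntu comes from the uidmap package and normally needs the setuid bit (or equivalent file capabilities) to work for unprivileged users. A quick way to inspect this inside the built image:
podman run --rm --user podman podinpodtest ls -l /usr/bin/newuidmap /usr/bin/newgidmap
If the setuid bit (the s in -rwsr-xr-x) is missing, the uid/gid mapping step will fail unless --privileged is used.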

how to deploy pyspark/spark image into k8s as running pods to access spark-shell?

I have created a pyspark image with:
spark 3.3.0
hadoop 3.3.4
I want to deploy it into k8s so I can customize the number of executors and their memory/CPU, but I do not want to deploy a Spark job. I want to deploy the pods from the plain pyspark image so I can kubectl exec into them and start spark-shell.
How can I achieve it?
1. I know how to use spark-operator.
2. I know how to deploy spark/pyspark jobs with jars/.py+.zip files.
But neither 1 nor 2 is what I need here. Again, I purely want access to spark-shell.
So far I have been following this solution, but I am getting errors:
https://gist.github.com/echang0929/9a9ccf7241f9221b7e59b9ec243e05f5#file-medium-spark-shell-on-k8s-sh
export NS_NAME=dev
export SA_NAME=spark
export CLN_NAME=spark-client
export POD_IMAG=raddeprodacr.azurecr.io/spark-py:s3.3.0
export SVC_NAME=$CLN_NAME-headless
export SVC_PORT=19987
export CLS_ENDP="k8s://http://127.0.0.1:8001"
export EXR_INST=3
export EXR_MORY=7g
export DRV_MORY=7g
kubectl config set-context --current --namespace=$NS_NAME
kubectl create sa $SA_NAME \
--dry-run=client -o yaml | kubectl apply -f -
kubectl create clusterrolebinding ${SA_NAME}-${NS_NAME}-edit \
--clusterrole=edit \
--serviceaccount=$NS_NAME:$SA_NAME \
--namespace=$NS_NAME \
--dry-run=client -o yaml | kubectl apply -f -
# CLN_NAME and POD_IMAG
kubectl run $CLN_NAME \
--image=$POD_IMAG \
--image-pull-policy=Always \
--serviceaccount=$SA_NAME \
--overrides='{"spec": {"nodeSelector": {"agentpool": "small"}}}' \
--dry-run=client -o yaml \
--command=true -- sh -c "exec tail -f /dev/null" | kubectl apply -f -
# SVC_NAME and SVC_PORT
kubectl expose pod $CLN_NAME \
--name=$SVC_NAME \
--type=ClusterIP \
--cluster-ip=None \
--port=$SVC_PORT \
--dry-run=client -o yaml | kubectl apply -f -
### Start spark-shell
kubectl exec -it $CLN_NAME -- sh -c '\
cd /opt/spark/; \
./bin/spark-shell \
--master k8s://"'$CLS_ENDP'" \
--deploy-mode client \
--conf spark.kubernetes.namespace="'$NS_NAME'" \
--conf spark.kubernetes.container.image="'$POD_IMAG'" \
--conf spark.kubernetes.container.image.pullPolicy=Always \
--conf spark.kubernetes.authenticate.serviceAccountName="'$SA_NAME'" \
--conf spark.kubernetes.driver.pod.name="'$CLN_NAME'" \
--conf spark.executor.instances="'$EXR_INST'" \
--conf spark.executor.memory="'$EXR_MORY'" \
--conf spark.driver.memory="'$DRV_MORY'" \
--conf spark.driver.host="'$SVC_NAME'" \
--conf spark.driver.port="'$SVC_PORT'" \
--conf spark.jars.ivy=/tmp/.ivy'
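Since the image is a pyspark build, the same pattern should also work for the Python shell by swapping spark-shell for pyspark; a sketch reusing the variables above (same assumptions as the gist):
kubectl exec -it $CLN_NAME -- sh -c '\
cd /opt/spark/; \
./bin/pyspark \
--master k8s://"'$CLS_ENDP'" \
--deploy-mode client \
--conf spark.kubernetes.namespace="'$NS_NAME'" \
--conf spark.kubernetes.container.image="'$POD_IMAG'" \
--conf spark.kubernetes.authenticate.serviceAccountName="'$SA_NAME'" \
--conf spark.kubernetes.driver.pod.name="'$CLN_NAME'" \
--conf spark.executor.instances="'$EXR_INST'" \
--conf spark.executor.memory="'$EXR_MORY'" \
--conf spark.driver.memory="'$DRV_MORY'" \
--conf spark.driver.host="'$SVC_NAME'" \
--conf spark.driver.port="'$SVC_PORT'" \
--conf spark.jars.ivy=/tmp/.ivy'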

can't start solana validator node

Hello, I'm currently trying to start a validator node on a server. Following the documentation, I made the systemd unit file as shown:
[Unit]
Description=Solana Validator
After=network.target
Wants=solana-sys-tuner.service
StartLimitIntervalSec=0
[Service]
Type=simple
Restart=always
RestartSec=1
User=cmfirpc1
LimitNOFILE=1000000
#LogRateLimitIntervalSec=0
Environment="PATH=/bin:/usr/bin:/home/cmfirpc1/.local/share/solana/install/active_release/bin"
ExecStart=/home/cmfirpc1/bin/validator.sh
[Install]
WantedBy=multi-user.target
and created a validator.sh file as shown below,
#!/bin/bash exec solana-validator \
--identity ~/validator-keypair.json \
--vote-account ~/vote-account-keypair.json \
--rpc-port 8899 \
--entrypoint entrypoint.mainnet-beta.solana.com:8001 \
--limit-ledger-size \ --log ~/solana-validator.log
and executed chmod +x on validator.sh.
However, I get this error:
● sol.service - Solana Validator
Loaded: loaded (/etc/systemd/system/sol.service; enabled; vendor preset: enabled)
Active: activating (auto-restart) (Result: exit-code) since Fri 2021-12-03 23:40:44 UTC; 375ms ago
Process: 263114 ExecStart=/home/cmfirpc1/bin/validator.sh (code=exited, status=203/EXEC)
Main PID: 263114 (code=exited, status=203/EXEC)
It seems you are missing a newline. Because there is no line break after the shebang, the exec command ends up on the #! line and is interpreted as part of it rather than being run.
#!/bin/bash
exec solana-validator \
--identity ~/validator-keypair.json \
--vote-account ~/vote-account-keypair.json \
--rpc-port 8899 \
--entrypoint entrypoint.mainnet-beta.solana.com:8001 \
--limit-ledger-size \
--log ~/solana-validator.log
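With the corrected script in place, restarting the unit and watching its logs should confirm the fix (a sketch, using the sol.service name from the status output above):
sudo systemctl restart sol.service
systemctl status sol.service
journalctl -u sol.service -f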

Systemd ExecStart with arguments

I have a process that I run this way:
sudo RADIODEV=/dev/spidev0.0 /opt/basicstation/build-rpi-std/bin/station -d -h /opt/basicstation/build-rpi-std/bin
I would like to launch it at Raspberry Pi boot with systemd, like this:
[Unit]
Description=Basic station secure websocket
Wants=network-online.target
After=network-online.target
[Service]
User=root
Group=root
ExecStart= RADIODEV=/dev/spidev0.0 /opt/basicstation/build-rpi-std/bin/station -d -h /opt/basicstation/build-rpi-std/bin
[Install]
WantedBy=multi-user.target
Alias=basic_station.service
So I want to know how to pass the arguments:
RADIODEV=/dev/spidev0.0
-d
-h /opt/basicstation/build-rpi-std/bin
because when I just put:
ExecStart= RADIODEV=/dev/spidev0.0 /opt/basicstation/build-rpi-std/bin/station -d -h /opt/basicstation/build-rpi-std/bin
it does not work.
I have already checked some issues, like:
issue systemd
But I can't reproduce what they propose.
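A minimal sketch of one common approach (my suggestion, not taken from the post): ExecStart= expects the command line to start with an executable path, so the RADIODEV assignment goes into an Environment= directive instead of being prefixed to the command:
[Service]
User=root
Group=root
Environment=RADIODEV=/dev/spidev0.0
ExecStart=/opt/basicstation/build-rpi-std/bin/station -d -h /opt/basicstation/build-rpi-std/bin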

Gunicorn around 100% CPU usage

Sometimes my web server just stops responding.
What I found out is that at these moments the Gunicorn processes' CPU load is around 100%. I have not changed the codebase in a while, so I don't think that is the cause.
Here's the bash script I use to run gunicorn:
#!/bin/bash
source /etc/profile.d/myapp.sh
NAME="myapp-web-services"
DJANGODIR="/home/myapp/myapp-web-services"
SOCKFILE=/tmp/myapp-web-services.sock
USER=myapp
GROUP=myapp
NUM_WORKERS=9
TIMEOUT=100
DJANGO_WSGI_MODULE=settings.wsgi
echo "Starting $NAME as `whoami`"
cd $DJANGODIR
source /home/myapp/.virtualenvs/myapp-web-services/bin/activate
export PYTHONPATH=$DJANGODIR:$PYTHONPATH
RUNDIR=$(dirname $SOCKFILE)
test -d $RUNDIR || mkdir -p $RUNDIR
exec newrelic-admin run-program /home/myapp/.virtualenvs/myapp-web-services/bin/gunicorn ${DJANGO_WSGI_MODULE}:application \
--name $NAME \
--workers $NUM_WORKERS \
--user=$USER --group=$GROUP \
--bind=unix:$SOCKFILE \
--log-level=warning \
--timeout=$TIMEOUT \
--log-file=- \
--max-requests=1200
I have 4 CPUs in the system, so according to the documentation's (2 × CPUs) + 1 rule, 9 workers should be fine.
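A hedged diagnostic sketch (an assumption on my side, not part of the original setup): when a worker is stuck at 100% CPU, looking at the busy thread and its Python stack can show where it is spinning:
top -H -p <worker-pid>
py-spy dump --pid <worker-pid>
Here <worker-pid> is the PID of one of the spinning gunicorn workers, and py-spy has to be installed separately (pip install py-spy).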
