Unable to take cluster backups using Medusa - Cassandra

I have a simple 3-node cluster with some sample data. I want to take a backup of the cluster (to S3) and restore it into a different cluster.
I can back up and restore a single node, but a cluster-level backup never seems to work. It throws the following error.
Cassandra version: 3.10, OS: Amazon Linux 2 (CentOS-based)
[2022-05-20 06:37:20,534] INFO: Creating snapshots on all nodes
[2022-05-20 06:37:20,534] INFO: Executing "nodetool -Dcom.sun.jndi.rmiURLParsing=legacy snapshot -t medusa-samp" on following nodes ['nodecassandra-2.localdomain', 'nodecassandra-3.localdomain', 'nodecassandra-1.localdomain'] with a parallelism/pool size of 500
[2022-05-20 06:37:21,129] ERROR: Job executing "nodetool -Dcom.sun.jndi.rmiURLParsing=legacy snapshot -t medusa-samp" ran and finished with errors on following nodes: ['nodecassandra-1.localdomain', 'nodecassandra-2.localdomain', 'nodecassandra-3.localdomain']
[2022-05-20 06:37:21,130] INFO: [nodecassandra-2.localdomain] [err] /bin/bash: nodetool: command not found
[2022-05-20 06:37:21,130] INFO: nodecassandra-2.localdomain-stderr: /bin/bash: nodetool: command not found
[2022-05-20 06:37:21,130] INFO: [nodecassandra-3.localdomain] [err] /bin/bash: nodetool: command not found
[2022-05-20 06:37:21,130] INFO: nodecassandra-3.localdomain-stderr: /bin/bash: nodetool: command not found
[2022-05-20 06:37:21,130] INFO: [nodecassandra-1.localdomain] [err] /bin/bash: nodetool: command not found
[2022-05-20 06:37:21,130] INFO: nodecassandra-1.localdomain-stderr: /bin/bash: nodetool: command not found
[2022-05-20 06:37:21,130] ERROR: Some nodes failed to create the snapshot.
[2022-05-20 06:37:21,130] ERROR: This error happened during the cluster backup: Some nodes failed to create the snapshot.
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/medusa/backup_cluster.py", line 64, in orchestrate
    backup.execute(cql_session_provider)
  File "/usr/local/lib/python3.7/site-packages/medusa/backup_cluster.py", line 146, in execute
    self._create_snapshots()
  File "/usr/local/lib/python3.7/site-packages/medusa/backup_cluster.py", line 163, in _create_snapshots
    raise Exception(err_msg)
Exception: Some nodes failed to create the snapshot.
[2022-05-20 06:37:21,131] ERROR: Something went wrong! Attempting to clean snapshots and exit.
[2022-05-20 06:37:21,131] INFO: Executing "nodetool -Dcom.sun.jndi.rmiURLParsing=legacy clearsnapshot -t medusa-samp" on following nodes ['nodecassandra-2.localdomain', 'nodecassandra-3.localdomain', 'nodecassandra-1.localdomain'] with a parallelism/pool size of 1
[2022-05-20 06:37:22,290] ERROR: Job executing "nodetool -Dcom.sun.jndi.rmiURLParsing=legacy clearsnapshot -t medusa-samp" ran and finished with errors on following nodes: ['nodecassandra-1.localdomain', 'nodecassandra-2.localdomain', 'nodecassandra-3.localdomain']
[2022-05-20 06:37:22,290] INFO: [nodecassandra-2.localdomain] [err] /bin/bash: nodetool: command not found
[2022-05-20 06:37:22,290] INFO: nodecassandra-2.localdomain-stderr: /bin/bash: nodetool: command not found
[2022-05-20 06:37:22,290] INFO: [nodecassandra-3.localdomain] [err] /bin/bash: nodetool: command not found
[2022-05-20 06:37:22,290] INFO: nodecassandra-3.localdomain-stderr: /bin/bash: nodetool: command not found
[2022-05-20 06:37:22,290] INFO: [nodecassandra-1.localdomain] [err] /bin/bash: nodetool: command not found
[2022-05-20 06:37:22,290] INFO: nodecassandra-1.localdomain-stderr: /bin/bash: nodetool: command not found
[2022-05-20 06:37:22,291] ERROR: Some nodes failed to clear the snapshot. Cleaning snapshots manually is recommended
I have added the Cassandra bin directory (which contains nodetool) to PATH, and set CASSANDRA_HOME and CASSANDRA_CONF in /etc/profile.d/cassandra.sh.
In medusa.ini I have provided the cassandra config file path, the S3 storage details, and the SSH details.
Am I missing anything else?
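The stderr lines above show that the non-interactive shells Medusa opens over SSH cannot find nodetool: /etc/profile.d/*.sh is only sourced by login shells, so the PATH set there is not visible to those sessions. One way to rule this out is sketched below, assuming Cassandra is installed under /opt/cassandra and Medusa connects as the cassandra user (both are assumptions, adjust to your setup):

# Run on every node; /opt/cassandra is a placeholder for your actual install directory
sudo ln -s /opt/cassandra/bin/nodetool /usr/local/bin/nodetool
# From the machine running medusa, verify a non-interactive shell now finds it
ssh cassandra@nodecassandra-1.localdomain 'which nodetool && nodetool version'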

Related

Ubuntu / CentOS container CrashLoopBackOff error

Whenever I run kubectl run ubuntu --image=ubuntu (or centos), I get a CrashLoopBackOff. When I check kubectl describe pod, the error below is observed:
Warning Failed 4s (x3 over 22s) kubelet Error: failed to create containerd task: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "ping": executable file not found in $PATH: unknown
Please suggest how to solve this issue.
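For what it's worth, a bare ubuntu or centos image has no long-running process (and no ping binary), so the container exits immediately and Kubernetes restarts it in a loop. A minimal sketch that keeps such a pod alive so you can exec into it (pod name and command are illustrative):

# Give the bare image something to run so it does not exit right away
kubectl run ubuntu --image=ubuntu --restart=Never -- sleep infinity
# Open a shell and install ping if you need it (Debian/Ubuntu package shown)
kubectl exec -it ubuntu -- bash -c 'apt-get update && apt-get install -y iputils-ping'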

Timed out waiting for a YugaByte DB cluster

ubuntu@vps-9b30a7d3:~/Database/yugabyte-2.1.8.1$ ./bin/yb-ctl create
Creating cluster.
Waiting for cluster to be ready.
Viewing file /home/ubuntu/yugabyte-data/node-1/disk-1/tserver.err:
sh: 1: /home/ubuntu/Database/yugabyte-2.1.8.1/bin/yb-tserver: not found
Viewing file /home/ubuntu/yugabyte-data/node-1/disk-1/master.err:
sh: 1: /home/ubuntu/Database/yugabyte-2.1.8.1/bin/yb-master: not found
Traceback (most recent call last):
  File "./bin/yb-ctl", line 2021, in <module>
    control.run()
  File "./bin/yb-ctl", line 1998, in run
    self.args.func()
  File "./bin/yb-ctl", line 1755, in create_cmd_impl
    self.wait_for_cluster_or_raise()
  File "./bin/yb-ctl", line 1598, in wait_for_cluster_or_raise
    raise RuntimeError("Timed out waiting for a YugaByte DB cluster!")
RuntimeError: Timed out waiting for a YugaByte DB cluster!
Viewing file /tmp/tmpfY6csf:
2020-08-02 10:15:38,864 INFO: Starting master-1 with:
/home/ubuntu/Database/yugabyte-2.1.8.1/bin/yb-master --fs_data_dirs "/home/ubuntu/yugabyte-data/node-1/disk-1" --webserver_interface 127.0.0.1 --rpc_bind_addresses 127.0.0.1 --v 0 --version_file_json_path=/home/ubuntu/Database/yugabyte-2.1.8.1 --webserver_doc_root "/home/ubuntu/Database/yugabyte-2.1.8.1/www" --replication_factor=1 --yb_num_shards_per_tserver 2 --ysql_num_shards_per_tserver=2 --default_memory_limit_to_ram_ratio=0.35 --master_addresses 127.0.0.1:7100 --enable_ysql=true >"/home/ubuntu/yugabyte-data/node-1/disk-1/master.out" 2>"/home/ubuntu/yugabyte-data/node-1/disk-1/master.err" &
2020-08-02 10:15:38,871 INFO: Starting tserver-1 with:
/home/ubuntu/Database/yugabyte-2.1.8.1/bin/yb-tserver --fs_data_dirs "/home/ubuntu/yugabyte-data/node-1/disk-1" --webserver_interface 127.0.0.1 --rpc_bind_addresses 127.0.0.1 --v 0 --version_file_json_path=/home/ubuntu/Database/yugabyte-2.1.8.1 --webserver_doc_root "/home/ubuntu/Database/yugabyte-2.1.8.1/www" --tserver_master_addrs=127.0.0.1:7100 --yb_num_shards_per_tserver=2 --redis_proxy_bind_address=127.0.0.1:6379 --cql_proxy_bind_address=127.0.0.1:9042 --local_ip_for_outbound_sockets=127.0.0.1 --use_cassandra_authentication=false --ysql_num_shards_per_tserver=2 --default_memory_limit_to_ram_ratio=0.65 --enable_ysql=true --pgsql_proxy_bind_address=127.0.0.1:5433 >"/home/ubuntu/yugabyte-data/node-1/disk-1/tserver.out" 2>"/home/ubuntu/yugabyte-data/node-1/disk-1/tserver.err" &
2020-08-02 10:15:38,873 INFO: Waiting for master and tserver processes to come up.
2020-08-02 10:15:49,111 INFO: PIDs found: {'tserver': [None], 'master': [None]}
2020-08-02 10:15:49,113 ERROR: Failed waiting for master and tserver processes to come up.
^^^ Encountered errors ^^^
I have a test server on which I am trying to install YugabyteDB. Every time I try to create a cluster, it throws this error. On my local server I hit the same error, but when I checked the cluster status it showed the cluster had been created. On the test server, however, the status shows that no node was created, although the yugabyte-data folder does get created.
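A "not found" message for a binary that appears to be on disk usually means the file was never extracted or its loader is missing; YugabyteDB tarballs ship a post-install step for Linux that patches the binaries. A minimal check, assuming the extraction directory shown in the log above:

cd ~/Database/yugabyte-2.1.8.1
ls -l bin/yb-master bin/yb-tserver   # confirm the binaries were actually extracted and are executable
./bin/post_install.sh                # patch the shipped binaries so they can run on this host
./bin/yb-ctl destroy                 # clean up the half-created data directory
./bin/yb-ctl create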

0-glusterfs: failed to set volfile server: File exists

My Kafka deployment uses GlusterFS as its storage. When I apply the Kafka YAML, the pod stays in ContainerCreating status, so I checked the pod with kubectl describe and get the following error:
Warning FailedMount 24m kubelet, 10.0.0.156 MountVolume.SetUp failed for volume "pvc-4cebf743-e9a3-4bc0-b96a-e3bca2d7c65b" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/a32117ca-3ce6-4fc4-b75a-15b63b859b71/volumes/kubernetes.io~glusterfs/pvc-4cebf743-e9a3-4bc0-b96a-e3bca2d7c65b --scope -- mount -t glusterfs -o auto_unmount,backup-volfile-servers=10.0.0.154:10.0.0.155:10.0.0.156,log-file=/var/lib/kubelet/plugins/kubernetes.io/glusterfs/pvc-4cebf743-e9a3-4bc0-b96a-e3bca2d7c65b/kafka-0-glusterfs.log,log-level=ERROR 10.0.0.155:vol_5fcfa0f585ce3677e573cf97f40191d3 /var/lib/kubelet/pods/a32117ca-3ce6-4fc4-b75a-15b63b859b71/volumes/kubernetes.io~glusterfs/pvc-4cebf743-e9a3-4bc0-b96a-e3bca2d7c65b
Output: Running scope as unit run-10840.scope.
[2020-03-14 13:56:14.771098] E [glusterfsd.c:825:gf_remember_backup_volfile_server] 0-glusterfs: failed to set volfile server: File exists
Mount failed. Please check the log file for more details.
, the following error information was pulled from the glusterfs log to help diagnose this issue:
[2020-03-14 13:56:14.782472] E [glusterfsd-mgmt.c:1958:mgmt_getspec_cbk] 0-glusterfs: failed to get the 'volume file' from server
[2020-03-14 13:56:14.782519] E [glusterfsd-mgmt.c:2151:mgmt_getspec_cbk] 0-mgmt: failed to fetch volume file (key:vol_5fcfa0f585ce3677e573cf97f40191d3)
Warning FailedMount 24m kubelet, 10.0.0.156 MountVolume.SetUp failed for volume "pvc-4cebf743-e9a3-4bc0-b96a-e3bca2d7c65b" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/a32117ca-3ce6-4fc4-b75a-15b63b859b71/volumes/kubernetes.io~glusterfs/pvc-4cebf743-e9a3-4bc0-b96a-e3bca2d7c65b --scope -- mount -t glusterfs -o auto_unmount,backup-volfile-servers=10.0.0.154:10.0.0.155:10.0.0.156,log-file=/var/lib/kubelet/plugins/kubernetes.io/glusterfs/pvc-4cebf743-e9a3-4bc0-b96a-e3bca2d7c65b/kafka-0-glusterfs.log,log-level=ERROR 10.0.0.154:vol_5fcfa0f585ce3677e573cf97f40191d3 /var/lib/kubelet/pods/a32117ca-3ce6-4fc4-b75a-15b63b859b71/volumes/kubernetes.io~glusterfs/pvc-4cebf743-e9a3-4bc0-b96a-e3bca2d7c65b
Output: Running scope as unit run-11012.scope.
[2020-03-14 13:56:15.441030] E [glusterfsd.c:825:gf_remember_backup_volfile_server] 0-glusterfs: failed to set volfile server: File exists
Mount failed. Please check the log file for more details.
, the following error information was pulled from the glusterfs log to help diagnose this issue:
[2020-03-14 13:56:15.452832] E [glusterfsd-mgmt.c:1958:mgmt_getspec_cbk] 0-glusterfs: failed to get the 'volume file' from server
[2020-03-14 13:56:15.452871] E [glusterfsd-mgmt.c:2151:mgmt_getspec_cbk] 0-mgmt: failed to fetch volume file (key:vol_5fcfa0f585ce3677e573cf97f40191d3)
Warning FailedMount 24m kubelet, 10.0.0.156 MountVolume.SetUp failed for volume "pvc-4cebf743-e9a3-4bc0-b96a-e3bca2d7c65b" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/a32117ca-3ce6-4fc4-b75a-15b63b859b71/volumes/kubernetes.io~glusterfs/pvc-4cebf743-e9a3-4bc0-b96a-e3bca2d7c65b --scope -- mount -t glusterfs -o auto_unmount,backup-volfile-servers=10.0.0.154:10.0.0.155:10.0.0.156,log-file=/var/lib/kubelet/plugins/kubernetes.io/glusterfs/pvc-4cebf743-e9a3-4bc0-b96a-e3bca2d7c65b/kafka-0-glusterfs.log,log-level=ERROR 10.0.0.154:vol_5fcfa0f585ce3677e573cf97f40191d3 /var/lib/kubelet/pods/a32117ca-3ce6-4fc4-b75a-15b63b859b71/volumes/kubernetes.io~glusterfs/pvc-4cebf743-e9a3-4bc0-b96a-e3bca2d7c65b
Output: Running scope as unit run-11236.scope.
[2020-03-14 13:56:16.646525] E [glusterfsd.c:825:gf_remember_backup_volfile_server] 0-glusterfs: failed to set volfile server: File exists
Mount failed. Please check the log file for more details.
, the following error information was pulled from the glusterfs log to help diagnose this issue:
[2020-03-14 13:56:16.658118] E [glusterfsd-mgmt.c:1958:mgmt_getspec_cbk] 0-glusterfs: failed to get the 'volume file' from server
[2020-03-14 13:56:16.658168] E [glusterfsd-mgmt.c:2151:mgmt_getspec_cbk] 0-mgmt: failed to fetch volume file (key:vol_5fcfa0f585ce3677e573cf97f40191d3)
Warning FailedMount 24m kubelet, 10.0.0.156 MountVolume.SetUp failed for volume "pvc-4cebf743-e9a3-4bc0-b96a-e3bca2d7c65b" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/a32117ca-3ce6-4fc4-b75a-15b63b859b71/volumes/kubernetes.io~glusterfs/pvc-4cebf743-e9a3-4bc0-b96a-e3bca2d7c65b --scope -- mount -t glusterfs -o auto_unmount,backup-volfile-servers=10.0.0.154:10.0.0.155:10.0.0.156,log-file=/var/lib/kubelet/plugins/kubernetes.io/glusterfs/pvc-4cebf743-e9a3-4bc0-b96a-e3bca2d7c65b/kafka-0-glusterfs.log,log-level=ERROR 10.0.0.154:vol_5fcfa0f585ce3677e573cf97f40191d3 /var/lib/kubelet/pods/a32117ca-3ce6-4fc4-b75a-15b63b859b71/volumes/kubernetes.io~glusterfs/pvc-4cebf743-e9a3-4bc0-b96a-e3bca2d7c65b
Output: Running scope as unit run-11732.scope.
How can I solve the problem?
Ensure you have the right name of your volume in the YAML file under path: <the_volume_name> (a quick way to verify this is sketched after these steps).
To show all gluster volumes use:
sudo gluster volume status all
Restart the volume (in this example the volume is called gfs):
gluster volume stop gfs
gluster volume start gfs
Now delete your pod and create it again.
Alternatively, try Kadalu.io or Ceph storage.
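To verify the first step without guessing, compare the volume name the PersistentVolume points at with what Gluster actually serves (PV name taken from the events above):

# Volume name referenced by the glusterfs-backed PV
kubectl get pv pvc-4cebf743-e9a3-4bc0-b96a-e3bca2d7c65b -o jsonpath='{.spec.glusterfs.path}'
# Volume names the Gluster cluster serves
sudo gluster volume list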

Error encountered while starting the network in hyperledger

I received this error upon starting my network.
Command used: ./byfn.sh up
Operating system: Windows 10 Pro
Error:
OCI runtime exec failed: exec failed: container_linux.go:348: starting container process caused "exec: \"scripts/script.sh\": stat scripts/script.sh: no such file or directory": unknown
ERROR !!!! Test failed
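On Windows this error usually means scripts/script.sh never made it into the cli container because the fabric-samples bind mount failed or the path was mangled by Git Bash. A hedged sketch of the common workarounds (the environment variables are standard Docker Compose / Git Bash settings, not byfn-specific):

# Let docker-compose convert Windows paths in volume mounts
export COMPOSE_CONVERT_WINDOWS_PATHS=1
# Stop Git Bash (MSYS) from rewriting /opt/... style paths passed to Docker
export MSYS_NO_PATHCONV=1
./byfn.sh down
./byfn.sh up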

Puppet Agent Could not retrieve catalog

I installed the Maven module on the master machine using this command:
puppet module install maestrodev-maven --version 1.4.0
It installed successfully into /etc/puppet/modules/.
Afterwards I added the following code to /etc/puppet/manifests/site.pp on the master machine:
node 'test02.edureka.com' {
  include maven
}
Now, when I run the following command on the Puppet agent machine:
puppet agent -t
it gives this error:
root@test02:~# puppet agent -t
Warning: Unable to fetch my node definition, but the agent run will continue:
Warning: execution expired
Info: Retrieving pluginfacts
Error: /File[/var/lib/puppet/facts.d]: Failed to generate additional resources using 'eval_generate': execution expired
Error: /File[/var/lib/puppet/facts.d]: Could not evaluate: Could not retrieve file metadata for puppet://test01.edureka.com/pluginfacts: execution expired
Info: Retrieving plugin
Error: /File[/var/lib/puppet/lib]: Failed to generate additional resources using 'eval_generate': execution expired
Error: /File[/var/lib/puppet/lib]: Could not evaluate: Could not retrieve file metadata for puppet://test01.edureka.com/plugins: execution expired
Info: Loading facts
Error: JAVA_HOME is not defined correctly.
We cannot execute
Could not retrieve fact='maven_version', resolution='': undefined method `split' for nil:NilClass
Error: Could not retrieve catalog from remote server: execution expired
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
Error: Could not send report: execution expired
root@test02:~#
puppet.conf from the master, puppet.conf from the agent, and an error screenshot were attached but are not reproduced here.
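The repeated "execution expired" messages typically mean the agent times out talking to the master (test01.edureka.com) on port 8140, and the maven_version fact fails because JAVA_HOME is not set on the agent. A minimal connectivity check to run on the agent, using the hostnames from the output above (nc may need to be installed):

# Can the agent resolve and reach the master's Puppet port?
ping -c 1 test01.edureka.com
nc -zv test01.edureka.com 8140
# Re-run with debug output once connectivity works
puppet agent -t --debug
# The maven fact also needs Java on the agent, e.g. (path is illustrative):
# export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64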
