Heketi can't provision a volume for Heketi database - glusterfs

I'm trying to build a glusterfs cluster with Heketi for Kubernetes persistent volumes. I have 3 nodes in the gluster cluster:
heketi-cli node list
Id:242e801e6eeb7ec10acda60a409b5d98 Cluster:fd539c5d13b6229498c6c67ac491163d
Id:439fb090888a745633f9db6ac4d243b8 Cluster:fd539c5d13b6229498c6c67ac491163d
Id:5e9b7e5f3ec33c77c42437e89ca857a3 Cluster:fd539c5d13b6229498c6c67ac491163d
But when I try to provision a volume for the Heketi database with the command:
heketi-cli setup-openshift-heketi-storage
I get an error:
Error: No space
But I have enough free space on my devices:
Devices:
Id:931b4f87e3675368a4f737ed6862e0cf Name:/dev/sdb State:online Size (GiB):29 Used (GiB):0 Free (GiB):29
Devices:
Id:3a2a30b22ade4efca7949e9cc082b685 Name:/dev/sdb State:online Size (GiB):29 Used (GiB):0 Free (GiB):29
Devices:
Id:5d1b5c7b258c52569bff1e1c720015c5 Name:/dev/sdb State:online Size (GiB):29 Used (GiB):0 Free (GiB):29
What can be the reason for this strange behavior?

My apologies, I have found the reason: the number of gluster nodes registered in Heketi has to match the number of gluster instances running in Kubernetes. In my previous attempt I had only 3 gluster nodes but 4 gluster instances in Kubernetes.
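A quick way to compare the two counts (a hedged sketch; the grep pattern assumes the gluster pods have "glusterfs" in their names, which may differ in your deployment):
$ heketi-cli node list | wc -l                    # nodes registered in heketi
$ kubectl get pods -o wide | grep -c glusterfs    # gluster pods running in kubernetes
Both numbers should match before running setup-openshift-heketi-storage.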

There can be a number of problems that lead to this error message. The 2 most common ones are:
You do not have the minimum of 3 nodes in your gluster cluster
The heketi-cli setup-openshift-heketi-storage command needs to create a volume for heketi's database. That volume is now 2GB by default but it used to be 32GB(!) (see heketi issue #639). So depending on your heketi-cli version it may be trying to create a 32GB volume on your 29GB bricks. Nasty.
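Before digging into logs, you can rule both of these out from heketi itself (a hedged sketch, using standard heketi-cli commands):
$ heketi-cli node list | wc -l     # should report at least 3 nodes
$ heketi-cli topology info         # shows clusters, nodes and per-device free space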
I suggest you look at the logs of heketi:
$ kubectl get pod -l name=heketi
NAME READY STATUS RESTARTS AGE
heketi-703226055-7g3hb 1/1 Running 0 18h
$ kubectl logs heketi-703226055-7g3hb -f
Heketi v3.0.0-111-gc5f0f58
[heketi] INFO 2017/02/14 22:17:53 Loaded kubernetes executor
...

Related

Glusterfs geo-replication issue

I have been using geo-replication for the last two months and posted this on the GlusterFS GitHub, but no answers so far.
Description of problem: after copying ~8TB without any issue, some nodes are flipping between Active and Faulty with the following error message in the gsyncd log:
ssh> failed with UnicodeDecodeError: 'ascii' codec can't decode byte 0xf2 in position 60: ordinal not in range(128).
The default encoding on all machines is utf-8.
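A minimal way to double-check what encoding a non-interactive session actually reports, since gsyncd runs over non-interactive ssh rather than a login shell (a sketch; <node> is a placeholder hostname):
$ locale                              # interactive login on this node
$ ssh <node> locale                   # non-interactive session, closer to what gsyncd sees
$ python3 -c 'import locale; print(locale.getpreferredencoding())'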
Command to reproduce the issue:
gluster volume geo-replication master_vol user@slave_machine::slave_vol start
The full output of the command that failed:
The command itself works fine, but the failures only appear after the geo-replication session has been started and is running, so the command itself is not the issue.
Expected results:
No such failures, copy should go as planned
Mandatory info:
The output of the gluster volume info command:
Volume Name: volname
Type: Distributed-Replicate
Volume ID: d5a46398-9638-4b50-9db0-4cd7019fa526
Status: Started
Snapshot Count: 0
Number of Bricks: 12 x 2 = 24
Transport-type: tcp
Bricks: 24 bricks (names omitted since they are not relevant and the list is too long)
Options Reconfigured:
features.ctime: off
cluster.min-free-disk: 15%
performance.readdir-ahead: on
server.event-threads: 8
cluster.consistent-metadata: on
performance.cache-refresh-timeout: 1
diagnostics.client-log-level: WARNING
diagnostics.brick-log-level: WARNING
performance.flush-behind: off
performance.cache-size: 5GB
performance.cache-max-file-size: 1GB
performance.io-thread-count: 32
performance.write-behind-window-size: 8MB
client.event-threads: 8
network.inode-lru-limit: 1000000
performance.md-cache-timeout: 1
performance.cache-invalidation: false
performance.stat-prefetch: on
features.cache-invalidation-timeout: 30
features.cache-invalidation: off
cluster.lookup-optimize: on
performance.client-io-threads: on
nfs.disable: on
transport.address-family: inet
storage.owner-uid: 33
storage.owner-gid: 33
features.bitrot: on
features.scrub: Active
features.scrub-freq: weekly
cluster.rebal-throttle: lazy
geo-replication.indexing: on
geo-replication.ignore-pid-check: on
changelog.changelog: on
The output of the gluster volume status command:
I don't really think this is relevant as everything seems fine; if needed I'll post it.
The output of the gluster volume heal command:
Same as before
- Provide logs present in the following locations of client and server nodes:
/var/log/glusterfs/
The logs there are not the relevant ones since this is geo-replication; posting the exact issue instead (this log is from a master volume node):
[2022-09-23 09:53:32.565196] I [master(worker /bricks/brick1/data):1439:process] _GMaster: Entry Time Taken [{MKD=0}, {MKN=0}, {LIN=0}, {SYM=0}, {REN=0}, {RMD=0}, {CRE=0}, {duration=0.0000}, {UNL=0}]
[2022-09-23 09:53:32.565651] I [master(worker /bricks/brick1/data):1449:process] _GMaster: Data/Metadata Time Taken [{SETA=0}, {SETX=0}, {meta_duration=0.0000}, {data_duration=1663926812.5656}, {DATA=0}, {XATT=0}]
[2022-09-23 09:53:32.566270] I [master(worker /bricks/brick1/data):1459:process] _GMaster: Batch Completed [{changelog_end=1663925895}, {entry_stime=None}, {changelog_start=1663925895}, {stime=(0, 0)}, {duration=673.9491}, {num_changelogs=1}, {mode=xsync}]
[2022-09-23 09:53:32.668133] I [master(worker /bricks/brick1/data):1703:crawl] _GMaster: processing xsync changelog [{path=/var/lib/misc/gluster/gsyncd/georepsession/bricks-brick1-data/xsync/XSYNC-CHANGELOG.1663926139}]
[2022-09-23 09:53:33.358545] E [syncdutils(worker /bricks/brick1/data):325:log_raise_exception] : connection to peer is broken
[2022-09-23 09:53:33.358802] E [syncdutils(worker /bricks/brick1/data):847:errlog] Popen: command returned error [{cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-GcBeU5/38c083bada86a45a28e6710377e456f6.sock geoaccount#slavenode6 /usr/libexec/glusterfs/gsyncd slave mastervol geoaccount#slavenode1::slavevol --master-node masternode21 --master-node-id 08c7423e-c2b6-4d40-adc8-d2ded4f66608 --master-brick /bricks/brick1/data --local-node slavenode6 --local-node-id bc1b3971-50a7-4b32-a863-aaaa02419de6 --slave-timeout 120 --slave-log-level INFO --slave-gluster-log-level INFO --slave-gluster-command-dir /usr/sbin --master-dist-count 12}, {error=1}]
[2022-09-23 09:53:33.358927] E [syncdutils(worker /bricks/brick1/data):851:logerr] Popen: ssh> failed with UnicodeDecodeError: 'ascii' codec can't decode byte 0xf2 in position 60: ordinal not in range(128).
[2022-09-23 09:53:33.672739] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]
[2022-09-23 09:53:45.477905] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}]
- Is there any crash? Provide the backtrace and coredump:
The relevant log is provided above.
Additional info:
Master volume: 12x2 distributed-replicated setup; it has been working for a couple of years now with no big issues as of today. 160TB of data.
Slave volume: 2x(5+1) distributed-disperse setup, created exclusively to be a geo-replication slave. It managed to copy 11TB of data from the master, but is now failing.
The operating system / glusterfs version:
On ALL nodes: GlusterFS version 9.6
Master nodes OS: CentOS 7
Slave nodes OS: Debian 11
Extra questions
I don't really know if this is the place to ask, but while we're at it: is there any guidance on how to improve sync performance? I tried raising the sync_jobs parameter from 3 to 9, but as we saw (while it was working) it would only copy from 3 nodes at most, at a "low" speed (about 40% of our bandwidth). It could go as high as 1Gbps but the maximum we got was 370Mbps.
Also, is there any in-depth documentation for geo-replication? What we found only covers the basics, and we are missing more detailed material to read and dig into.

ext4 signature detected on /dev/sdb1 at offset 1080 while creating Gluster cluster

I am getting the attached error while creating a Gluster cluster on Kubernetes. Can someone please advise why this error occurs and how to resolve it?
This is solved by using the command wipefs -a /dev/sdb1, which wipes the stale signature present on the newly created partition.
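As a hedged side note, running wipefs without -a only lists the signatures it finds, so you can confirm the stale ext4 signature before destroying anything:
$ wipefs /dev/sdb1      # list existing filesystem signatures (read-only)
$ wipefs -a /dev/sdb1   # wipe all signatures so the device can be reused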

How can I inspect per executor/node memory usage metrics of a pyspark job on Dataproc?

I'm running a PySpark job in Google Cloud Dataproc, in a cluster with half the nodes being preemptible, and seeing several errors in the job output (the driver output) such as:
...spark.scheduler.TaskSetManager: Lost task 9696.0 in stage 0.0 ... Python worker exited unexpectedly (crashed)
...
Caused by java.io.EOFException
...
...YarnSchedulerBackend$YarnSchedulerEndpoint: Requesting driver to remove executor 177 for reason Container marked as failed: ... Exit status: -100. Diagnostics: Container released on a *lost* node
...spark.storage.BlockManagerMasterEndpoint: Error try to remove broadcast 3 from block manager BlockManagerId(...)
Perhaps by coincidence, the errors mostly seem to be coming from preemptible nodes.
My suspicion is that these opaque errors are coming from the node or executors running out of memory, but there don't seem to be any granular memory related metrics exposed by Dataproc.
How can I determine why a node was considered lost? Is there a way I can inspect memory usage per node or executor to validate whether these errors are being caused by high memory usage? If YARN is the one which is killing containers / determining nodes are lost, then hopefully there's a way to introspect why?
This is because you are using preemptible VMs, which are short-lived and last at most 24 hours. When GCE shuts down a preemptible VM, you see errors like this:
YarnSchedulerBackend$YarnSchedulerEndpoint: Requesting driver to remove executor 177 for reason Container marked as failed: ... Exit status: -100. Diagnostics: Container released on a lost node
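To confirm that a lost node really was preempted (rather than, say, killed for another reason), you can check the Compute Engine operations log. This is a hedged sketch; the filter assumes the standard preemption operation type:
$ gcloud compute operations list \
    --filter="operationType=compute.instances.preempted" \
    --project=${PROJECT}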
To dig deeper into YARN's view, open a secure shell from your machine to the cluster's master node. You'll need the gcloud SDK installed for that.
gcloud compute ssh ${HOSTNAME}-m --project=${PROJECT}
Then run the following commands in the cluster.
List all nodes in the cluster
yarn node -list
Then use ${NodeID} to get a report on the node's state:
yarn node -status ${NodeID}
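If YARN killed a container (for example for exceeding memory limits), the aggregated container logs usually say so. As a hedged extra step, with ${APP_ID} being your application id from the ResourceManager:
$ yarn logs -applicationId ${APP_ID} | grep -i killed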
You could also set up local port forwarding via SSH to the YARN web UI instead of running commands directly on the cluster.
gcloud compute ssh ${HOSTNAME}-m \
--project=${PROJECT} -- \
-L 8088:${HOSTNAME}-m:8088 -N
Then go to http://localhost:8088/cluster/apps in your browser.
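The same tunnel pattern works for a worker's NodeManager web UI (default port 8042), which shows per-container logs and resource usage on that specific node. ${WORKER} is a placeholder for a worker hostname, for example ${HOSTNAME}-w-0:
gcloud compute ssh ${WORKER} \
    --project=${PROJECT} -- \
    -L 8042:${WORKER}:8042 -N
Then go to http://localhost:8042 in your browser.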

cassandra service (3.11.5) stops automaticall after it starts/restart on AWS linux

The cassandra service (3.11.5) stops automatically after it starts/restarts on AWS Linux.
I have a fresh installation of cassandra on a new AWS Linux instance (t3.xlarge), and when I run
sudo service cassandra start
or
sudo service cassandra restart
the service stops automatically after 1 or 2 seconds. I looked into the logs and found the messages below.
I am not sure why; I haven't changed any configs related to the snitch and it has always been SimpleSnitch. I don't have multiple cassandra nodes, just a single one on one EC2 instance.
Logs
INFO [main] 2020-02-12 17:40:50,833 ColumnFamilyStore.java:426 - Initializing system.schema_aggregates
INFO [main] 2020-02-12 17:40:50,836 ViewManager.java:137 - Not submitting build tasks for views in keyspace system as storage service is not initialized
INFO [main] 2020-02-12 17:40:51,094 ApproximateTime.java:44 - Scheduling approximate time-check task with a precision of 10 milliseconds
ERROR [main] 2020-02-12 17:40:51,137 CassandraDaemon.java:759 - Cannot start node if snitch's data center (datacenter1) differs from previous data center (dc1). Please fix the snitch configuration, decommission and rebootstrap this node or use the flag -Dcassandra.ignore_dc=true.
Installation steps
sudo curl -OL https://www.apache.org/dist/cassandra/redhat/311x/cassandra-3.11.5-1.noarch.rpm
sudo rpm -i cassandra-3.11.5-1.noarch.rpm
sudo pip install cassandra-driver
export CQLSH_NO_BUNDLED=true
sudo chkconfig --levels 3 cassandra on
The issue is in your log file:
ERROR [main] 2020-02-12 17:40:51,137 CassandraDaemon.java:759 - Cannot start node if snitch's data center (datacenter1) differs from previous data center (dc1). Please fix the snitch configuration, decommission and rebootstrap this node or use the flag -Dcassandra.ignore_dc=true.
It seems that you started the cluster, stopped it and renamed the datacenter from dc1 to datacenter1.
In order to fix:
If no data is stored, delete the data directories (see the sketch below)
If data is stored, rename the datacenter back to dc1 in the config
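For the first case (a node with no data worth keeping), a minimal sketch assuming the default package paths under /var/lib/cassandra:
$ sudo service cassandra stop
$ sudo rm -rf /var/lib/cassandra/data /var/lib/cassandra/commitlog /var/lib/cassandra/saved_caches
$ sudo service cassandra start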
I had the same problem, where the cassandra service immediately stopped after it was started.
In the cassandra configuration file located at /etc/cassandra/cassandra.yaml, change the cluster_name back to the previous one, like this:
...
# The name of the cluster. This is mainly used to prevent machines in
# one logical cluster from joining another.
cluster_name: 'dc1'
# This defines the number of tokens randomly assigned to this node on the ring
# The more tokens, relative to other nodes, the larger the proportion of data
...

Error mounting azure vhd to kubernetes pod

On Kubernetes v1.4.3 I'm trying to mount an Azure disk (VHD) into a pod using the following configuration:
volumes:
  - name: "data"
    azureDisk:
      diskURI: "https://testdevk8disks685.blob.core.windows.net/vhds/test-disk-01.vhd"
      diskName: "test-disk-01"
But it returns the following error while creating the pod:
MountVolume.SetUp failed for volume "kubernetes.io/azure-disk/0a0e1c0f-9b7a-11e6-8cc5-000d3a32f480-data" (spec.Name: "data") pod "0a0e1c0f-9b7a-11e6-8cc5-000d3a32f480" (UID: "0a0e1c0f-9b7a-11e6-8cc5-000d3a32f480") with: mount failed: exit status 32
Mounting arguments: /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/falkonry-dev-k8-ampool-locator-01 /var/lib/kubelet/pods/0a0e1c0f-9b7a-11e6-8cc5-000d3a32f480/volumes/kubernetes.io~azure-disk/data [bind]
Output: mount: special device /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/test-disk-01 does not exist
There was a bug in v1.4.3 that caused this problem; it was fixed in v1.4.7+. Upgrading the Kubernetes cluster to an appropriate version solved the problem.
