Is it normal to get a lot of heal-failed entries in a gluster mount? - glusterfs

I run:
gluster volume heal myvol info heal-failed
and I get back a whole bunch of entries. Is this normal? Is anyone else out there seeing this in their implementation of glusterfs? If so, how do you go about resolving this?

The list of entries from "gluster volume heal myvol info heal-failed" can contain real failures, or it may simply list entries that the self-heal daemon failed to heal during that particular crawl.
Files and directories listed under "heal-failed" are usually healed gradually by the self-heal daemon on later crawls.
So yes, it is normal to see heal-failed entries.
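One way to separate transient entries from real failures is to compare the heal-failed list across several crawls: entries that disappear on their own were transient, while entries that show up every time deserve a closer look. Here is a minimal Python sketch of that comparison. It assumes you have already collected the entry paths from each run into lists; the function name and the sample paths are hypothetical, and parsing the gluster output is deliberately left out because I am not assuming its exact format.

```python
def persistent_entries(snapshots):
    """Return the entries that appear in every snapshot, sorted."""
    if not snapshots:
        return []
    common = set(snapshots[0])
    for entries in snapshots[1:]:
        # Keep only entries that are still listed in this later run.
        common &= set(entries)
    return sorted(common)


# Hypothetical entry lists gathered from three successive runs.
runs = [
    ["/data/a.txt", "/data/b.txt", "/logs/c.log"],
    ["/data/b.txt", "/logs/c.log"],
    ["/data/b.txt"],
]
print(persistent_entries(runs))
```

In this made-up example only /data/b.txt survives all three runs, so it is the one worth investigating.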

Related

Why is the load in nodetool status output much less than utilised disk space?

I have a 3-node cluster and, on checking nodetool status, Load is just under 100 GB on all three nodes. The replication factor is two and the ownership percentage is 65-70% for all three.
However, when I inspected the /data directory, it holds index.db files totalling more than 400 GB, and the total size of the keyspace directory is more than 700 GB.
Any idea why there is such a huge gap?
Let me know if any extra details are required :)
PS: nodetool listsnapshots command shows an empty list (No snapshots)
Analysis - we tried redeploying the setup but still the same results; tried researching this topic but no luck.
Expectation - I was expecting this difference in the load and the size of data directory to be negligible if not zero.
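This does not explain the gap by itself, but breaking the keyspace directory down by file type may help narrow down where the space is going. Below is a rough Python sketch. It assumes the file type is whatever follows the last "-" in each file name (for example "Index.db"); that naming pattern, the function names, and the sample file names are assumptions for illustration, so adjust them to what is actually on disk.

```python
import os
from collections import defaultdict


def component_of(file_name):
    """Return the text after the last '-' in a file name, e.g. 'Index.db'."""
    return file_name.rsplit("-", 1)[-1]


def usage_by_component(files):
    """Sum sizes per component for an iterable of (file_name, size_in_bytes)."""
    totals = defaultdict(int)
    for name, size in files:
        totals[component_of(name)] += size
    return dict(totals)


def walk_sizes(root):
    """Yield (file_name, size_in_bytes) for every file under root, recursively."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                yield name, os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                # The file vanished while we were walking; skip it.
                continue


# Hypothetical file names and sizes, for illustration only.
sample = [
    ("ks-cf-jb-1-Data.db", 100),
    ("ks-cf-jb-1-Index.db", 400),
    ("ks-cf-jb-2-Index.db", 50),
]
print(usage_by_component(sample))
```

On a real node you would call usage_by_component(walk_sizes(path_to_keyspace_directory)). Because the walk is recursive, any subdirectories under the keyspace directory are counted too.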

ERROR 1777 (HY000): Partition memsqldb:0 has no master instance

I am using the Community Edition of MemSQL. I got this error while I was running a query today, so I restarted my cluster and the error went away.
memsql-ops cluster-restart
But what happened, and what should I do in future to avoid this error?
NOTE
I do not want to buy the Enterprise Edition.
Question
Is this an availability problem?
I got this error when experimenting with performance.
VM had 24 CPUs and 25 nodes: 1 Master Agg, 24 Leaf nodes
Reduced VM to 4 CPUs and restarted cluster.
Not all of the leaves recovered.
All except 4 recovered in < 5 minutes.
20 minutes later, 4 leaf nodes still were not connected.
From MySQL/MemSQL prompt:
use db;
show partitions;
I noticed that some partitions (ordinals run from 0 to 71 in my case) show NULL instead of a Host, Port, and Role.
In memsql ops UI http://server:9000 > Settings > Config > Manual Cluster Control I checked "ENABLE MANUAL CONTROL" while I tried to run various commands with no real benefit.
Then 15 minutes later, I unchecked the box, Memsql-ops tried attaching all the leaf nodes again and was finally successful.
Perhaps a cluster restart would have done the same thing.
This happened because a leaf in your cluster has failed a health check heartbeat for some reason (loss of network connectivity, hardware failure, OS issue, machine overloaded, out of memory, etc.) and its partitions are no longer accessible to query. MemSQL Community Edition only supports redundancy 1 so there are no other copies of the data on the failed leaf node in your cluster (thus the error about missing a partition of data - MemSQL can't complete a query that needs to read data on any partitions on the problem leaf).
Given that a restart repaired things, the most likely explanation is that the Linux out-of-memory (OOM) killer terminated the MemSQL process; see the MemSQL documentation on the Linux OOM killer.
You can also check the tracelog on the leaf that ran into issues to see if there is any clue there about what happened (It's usually at /var/lib/memsql/leaf_3306/tracelogs/memsql.log)
-Adam
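To check the OOM theory, you can look through the kernel log on the affected leaf for OOM-killer messages. A small Python sketch of that filter follows. The phrases it searches for are the ones the kernel commonly logs, but the exact wording differs between kernel versions, so treat the pattern as an assumption; the sample log lines are made up for illustration.

```python
import re

# Phrases commonly seen in kernel logs when the OOM killer acts (assumed, not exhaustive).
OOM_PATTERN = re.compile(r"out of memory|oom-killer|killed process", re.IGNORECASE)


def find_oom_lines(log_text):
    """Return the log lines that look like OOM-killer activity."""
    return [line for line in log_text.splitlines() if OOM_PATTERN.search(line)]


# Hypothetical kernel log excerpt, for illustration only.
sample = """\
[1001.1] memsqld invoked oom-killer: gfp_mask=0x201da, order=0
[1001.2] Out of memory: Kill process 4242 (memsqld) score 900 or sacrifice child
[1001.3] Killed process 4242 (memsqld) total-vm:1024kB
[1002.0] eth0: link up
"""
for line in find_oom_lines(sample):
    print(line)
```

In practice you would pass in the text of the kernel log (for example the output of dmesg) instead of the sample string.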
I have faced this error too. In my case it was because some of the slave ordinals had no corresponding masters. My error message looked like:
ERROR 1772 (HY000) at line 1: Leaf Error (10.0.0.112:3306): Partition database `<db_name>_0` can't be promoted to master because it is provisioning replication
My memsql> SHOW PARTITIONS; command returned the following.
The approach I followed was to drop each such partition (where the role was either Slave or NULL).
DROP PARTITION <db_name>:4 ON "10.0.0.193":3306;
..
DROP PARTITION <db_name>:46 ON "10.0.0.193":3306;
Then I created a new partition for each of the dropped partitions.
CREATE PARTITION <db_name>:4 ON "10.0.0.193":3306;
..
CREATE PARTITION <db_name>:46 ON "10.0.0.193":3306;
And this was the result of memsql> SHOW PARTITIONS; after that.
If the above steps don't solve your problem, you can refer to the MemSQL documentation on partitions.
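If there are many affected ordinals, typing the statements by hand gets tedious. The following Python sketch only prints the DROP PARTITION and CREATE PARTITION statements in the form used above, so you can review them before running anything. The function name, the database name "mydb", and the ordinal list are hypothetical; take the real ordinals from your own SHOW PARTITIONS output.

```python
def recreate_partition_statements(db_name, ordinals, host, port=3306):
    """Build DROP PARTITION statements followed by CREATE PARTITION statements."""
    drops = [f'DROP PARTITION {db_name}:{n} ON "{host}":{port};' for n in ordinals]
    creates = [f'CREATE PARTITION {db_name}:{n} ON "{host}":{port};' for n in ordinals]
    # All drops come first, then all creates, mirroring the steps described above.
    return drops + creates


# Hypothetical database name and ordinals, for illustration only.
for stmt in recreate_partition_statements("mydb", [4, 46], "10.0.0.193"):
    print(stmt)
```

Nothing here connects to the cluster. Keep in mind that dropping a partition removes it from that leaf, so double-check the list before pasting it into the memsql prompt.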
I was hitting the same problem. Running the following command on the master node solved it:
REBALANCE PARTITIONS ON db_name
Optionally you can force it using FORCE:
REBALANCE PARTITIONS ON db_name FORCE
To see the list of operations the rebalance is going to execute, use the above command with EXPLAIN:
EXPLAIN REBALANCE PARTITIONS ON db_name [FORCE]

Index initializer warning during bootstrap

I'm trying to simultaneously add 4 nodes to my current 2-node DC. I have Vnodes turned off as per Datastax suggestion. Right after the major index build in each node, the following warning is printed several times in the logs:
WARN [SolrSecondaryIndex ks.cf index initializer.] 2014-06-20 09:39:59,904 CassandraUtil.java (line 108) Error Operation timed out - received only 3 responses. on attempt 1 out of 4 with CL QUORUM...
I understand what it means. But why is Cassandra expecting the nodes to fulfill the CL when these nodes are still bootstrapping? More importantly, how does the warning affect the bootstrap? I noticed that the nodes are not doing any index build or streaming anymore; but they also remained in "Active - Joining" state. Is there any chance that they will finish? What should I do?
I'm using DSE 4.0.3. All existing and new nodes in the DC are Search nodes. I pre-computed the tokens using the Python program for Murmur3Partitioner.
EDIT:
Although nodetool compactionstats does not show any on-going index build in the nodes, for some reason, I still see a lot of these lines in the logs:
INFO [IndexPool backpressure thread-0] 2014-06-20 12:30:31,346 IndexPool.java (line 472) Throttling at 26 index requests per second with target total queue size at 40
INFO [IndexPool backpressure thread-0] 2014-06-20 12:30:34,169 IndexPool.java (line 428) Back pressure is active with total index queue size 18586 and average processing time 2770
EDIT:
Interestingly, I found the following lines in each node after digging through the log files:
INFO [main] 2014-06-20 09:39:48,588 StorageService.java (line 1036) Bootstrap completed! for the tokens [node token]
INFO [SolrSecondaryIndex ks.cf index initializer.] 2014-06-20 11:32:07,833 AbstractSolrSecondaryIndex.java (line 411) Reindexing 1417116631 commit log updates for core ks.cf
Based on these lines, I feel a lot safer that the bootstrap actually completed and that the nodes are simply re-indexing their data. I don't know, though, why the re-indexing process is not shown in nodetool compactionstats.
It appears the bootstrap completed, and the DSE Search system is running normally.
why the re-indexing process is not being shown in nodetool compactionstat
DSE Search is not generally exposed via the Cassandra command-line tools. The log output should show the indexing as having completed; were you able to verify that?

In Cassandra 1.2 - CQL 3 is it possible to abort a secondary index build?

Been using a 6GB dataset with each source record being ~1KB in length when I accidentally added an index on a column that I am pretty sure has a 100% cardinality.
I tried dropping the index from cqlsh, but by that point the two-node cluster had gone into a runaway death spiral, with loadavg surpassing 20 on each node, and cqlsh hung on the drop command for 30 minutes. Since this was just a test setup, I shut down and destroyed the cluster and restarted.
This is a fairly disconcerting problem as it makes me fear a scenario where a junior developer is on a production cluster and they set an index on a similar high cardinality column. I scanned through the documentation and looked at the options in nodetool but there didn't seem to be anything along the lines of "abort job or abort building index".
Test environment:
2x m1.xlarge EC2 instances with 2 Raid 0 ephemeral disks
Dataset was 6GB, 1KB per record.
My question in summary: is it possible to abort the process of building a secondary index, and/or to stop or postpone running builds (indexing, compaction) until a later date?
nodetool -h node_address stop index_build
See: http://www.datastax.com/docs/1.2/references/nodetool#nodetool-stop

GDG Roll In Error

While executing a proc, I am getting a 'GDG Roll In Error'. The error message says 'IGD07001I GDG ROLL IN ERROR -RETURN CODE 20 REASON CODE 0 MODULE IGG0CLEG'. The proc is supposed to create 19 generations of a GDG. The error occurs after creating the first 6 generations. The parameters of the GDG are Limit=100, NOEMPTY, SCRATCH. What could be the reason?
Experts, Please help.
If you look up IGD07001I it says, among other things, to look at IDC3009I for an explanation of the return and reason codes. For return code 20 reason code 0, IDC3009I says
RETURN CODE 20 Explanation: There is insufficient space in the catalog to perform the requested update or addition.
The catalog cannot be extended for one of the following reasons:
There is no more space on the volume on which the catalog resides
The maximum number of extents has been reached
The catalog has reached the 4GB limit
There is not enough contiguous space on the volume (required when the catalog's secondary allocation is defined in tracks).
Programmer Response: Scratch unneeded data sets from the volume. Delete all unnecessary entries from the catalog. The catalog may need to be reallocated and rebuilt if these steps do not resolve the space shortage.
I suggest contacting your DFSMS Administrator. I also suggest bookmarking the z/OS documentation for your version/release.
