Cassandra: nodetool repair not working

The Cassandra service on one of my nodes went down and we couldn't restart it because of corruption in one of the tables. So we tried rebuilding the node by deleting all of its data files and then starting the service. Once it showed up in the ring we ran nodetool repair multiple times, but it hung, throwing the same error each time:
Caused by: org.apache.cassandra.io.compress.CorruptBlockException: (/var/lib/cassandra/data/profile/AttributeKey/profile-AttributeKey-ib-1848-Data.db): corruption detected, chunk at 1177104 of length 11576.
This occurs after about 6 GB of data has been recovered. My replication factor is 3, so the same data is intact on the other two nodes.
I am fairly new to Cassandra and not sure what I am missing; has anybody seen this issue with repair? I have also tried scrubbing, but it failed because of the same corruption.
Please help.

rm /var/lib/cassandra/data/profile/AttributeKey/profile-AttributeKey-ib-1848-* and restart.
Scrub should not fail; please open a ticket to fix that at https://issues.apache.org/jira/browse/CASSANDRA.
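A rough sketch of that recovery sequence. The service commands assume a typical package install (adjust for your init system), and the keyspace/table names are taken from the path in the error:
# Stop Cassandra on the affected node (command depends on your install)
$ sudo service cassandra stop
# Remove only the component files of the corrupted SSTable generation
$ rm /var/lib/cassandra/data/profile/AttributeKey/profile-AttributeKey-ib-1848-*
# Start the node again and let repair stream the data back from the other replicas
$ sudo service cassandra start
$ nodetool repair profile AttributeKey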

First try nodetool scrub. If that does not fix it, shut down the node and run sstablescrub [yourkeyspace] [table]; the offline tool can remove corrupted SSTables that the nodetool scrub utility could not handle. Then run a repair and you should be able to sort out the issue.
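A minimal sketch of that sequence, assuming the keyspace and table from the error above (profile / AttributeKey); substitute your own names:
# Try the online scrub first
$ nodetool scrub profile AttributeKey
# If that does not help, stop the node and run the offline scrubber
$ sudo service cassandra stop
$ sstablescrub profile AttributeKey
# Bring the node back up and repair from the healthy replicas
$ sudo service cassandra start
$ nodetool repair profile AttributeKey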

Related

Cassandra lost data after restarting

I have run into a problem. I generated some data and could read all of it back; that worked well. Afterwards I shut down all CassandraDaemons and restarted them, and now I cannot get all the data back because values in some columns were lost. I don't know why this happens. Could anyone give me some advice? Thanks very much. By the way, I use Cassandra 2.1 and the replication factor is 1.
It seems that Cassandra failed to replay the commitlog when restarting, which caused the data loss, but I don't know why. One workaround is to force a flush of data into SSTables using nodetool before killing the CassandraDaemons.
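A minimal sketch of that workaround, assuming you can run nodetool against each node before stopping it; the service command is just an example for a package install:
# Force memtables to be written out as SSTables so nothing relies on commitlog replay
$ nodetool flush
# Now stop the daemon (nodetool drain is an alternative that also stops the node accepting writes)
$ sudo service cassandra stop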

Cassandra clean up data and file handlers for deleted tables

After I truncated and dropped a table in Cassandra, I still see its SSTables on disk, plus a lot of open file handles pointing to them.
What is the proper way to get rid of them?
Is there a possibility without restarting the Cassandra nodes?
We're using Cassandra 3.7.
In Cassandra, data does not get removed immediately; it is marked with tombstones instead. You can run nodetool repair to get rid of deleted data.
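A minimal sketch of the command this answer refers to; the keyspace and table names are placeholders:
# Repair the table in question across the replicas
$ nodetool repair my_keyspace my_table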

Does nodetool cleanup affect Apache Spark rdd.count() of a Cassandra table?

I've been tracking the growth of some big Cassandra tables using Spark rdd.count(). Up 'till now the expected behavior was consistent, the number of rows is constantly growing.
Today I ran nodetool cleanup on one of the seeds and, as usual, it ran for 50+ minutes.
And now rdd.count() returns one third of the rows it did before....
Did I destroy data using nodetool cleanup? Or is the Spark count unreliable and was it counting ghost keys? I got no errors during cleanup and the logs don't show anything out of the usual. It did seem like a successful operation, until now.
Update 2016-11-13
Turns out the Cassandra documentation set me up for the loss of 25+ million rows of data.
The documentation is explicit:
Use nodetool status to verify that the node is fully bootstrapped and all other nodes are up (UN) and not in any other state. After all new nodes are running, run nodetool cleanup on each of the previously existing nodes to remove the keys that no longer belong to those nodes. Wait for cleanup to complete on one node before running nodetool cleanup on the next node. Cleanup can be safely postponed for low-usage hours.
Well, you check the status of the other nodes via nodetool status and they are all Up and Normal (UN), BUT here's the catch: you also need to run nodetool describecluster, where you might find that the schemas were not synced.
My schemas were not synced, and I ran cleanup when all nodes were UN, up and running normally, as per the documentation. The Cassandra documentation does not mention nodetool describecluster after adding new nodes.
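A quick pre-cleanup check along those lines; both are standard nodetool subcommands, and the point is that describecluster must report a single schema version across all nodes before cleanup is safe:
# Every node should be Up/Normal (UN)
$ nodetool status
# Every node should appear under the same entry in "Schema versions"
$ nodetool describecluster
# Only then run cleanup, one previously existing node at a time
$ nodetool cleanup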
So I merrily added nodes, waited till they were UN (Up / Normal) and ran cleanup.
As a result, 25+ million rows of data are gone. I hope this helps others avoid this dangerous pitfall. Basically the Datastax documentation sets you up to destroy data by recommending cleanup as a step of the process of adding new nodes.
In my opinion, that cleanup step should be taken out of the new node procedure documentation altogether. It should be mentioned, elsewhere, that cleanup is a good practice but not in the same section as adding new nodes...it's like recommending rm -rf / as one of the steps for virus removal. Sure will remove the virus...
Thank you Aravind R. Yarram for your reply, I came to the same conclusion as your reply and came here to update this. Appreciate your feedback.
I am guessing you might have either added/removed nodes from the cluster or decreased replication factor before running nodetool cleanup. Until you run the cleanup, I guess Cassandra still reports the old key ranges as part of the rdd.count() as old data still exists on those nodes.
Reference:
https://docs.datastax.com/en/cassandra/2.0/cassandra/tools/toolsCleanup.html

cassandra nodetool repair/upgrade

I have a cassandra cluster with version 2.0.9 running.
Nodetool repair hasn't been run since the start (scheduling these repairs was never requested).
Each node has around 8GB of data. That seems rather small to me.
When I try to run nodetool repair it seems to take forever (not finished after 2 days).
I don't see any progress. I've been reading threads that tell you to check compactionstats and netstats, but those indicate no traffic. Still, the nodetool repair command never exits, which doesn't seem normal to me. I got messages about the system keyspace being repaired and being OK, but for the actual data we put in, the repair returns nothing.
All nodes are up. I've checked the system.log (CentOS 6, by the way) for errors and there aren't any. I've started a command that checks whether the number of commands and responses is still going up (which is the case), but I wonder whether that comes from something else or is directly linked to the nodetool repair.
There doesn't seem to be any IO/net saturation.
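For reference, the monitoring commands mentioned above; run them on the node being repaired to see whether anything is actually happening:
# Validation compactions triggered by repair show up here
$ nodetool compactionstats
# Streaming of ranges between nodes shows up here
$ nodetool netstats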
So yesterday I started the repair again with the range-repair.py tool.
For the last 12 hours there has been no extra output.
Last output was:
INFO 2015-11-01 20:55:46,268 repair line: 296 : [1/256] repairing range (-09214247901397780884, -09166106147119295777) in 100 steps for keyspace <all>
The main issue with this repair taking forever (or just being hung) is that we want to upgrade Cassandra for an app deployment. The procedure says to do a nodetool repair first. Is this actually necessary before you start the upgrade? Maybe a newer nodetool works more efficiently (you now also have an incremental option).
Who can help me here? Thanks a lot in advance!
I'm not sure if this fully resolved the issue, but after doing a rolling restart of the whole cluster it seemed that nodetool repair was able to finish where it didn't before. For another keyspace I had to start the process over and over again to get any progress. I used range_repair.py, which allowed me to skip to a certain token so I could slowly work my way up.
In the end I used the dry-run and steps options (1 step) and directed the output to a file. Then I filtered out the first column with sed and executed that file. If a command seems to hang you can note it down, Ctrl-C it and rerun it afterwards. Generally it succeeded the second or third time I ran it.
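A rough sketch of that workflow. The exact range_repair.py flags vary by version, so --keyspace, --steps and --dry-run are assumptions based on the description above, and the sed expression that strips the leading column is only illustrative:
# Assumed flags: print the per-range repair commands instead of running them
$ range_repair.py --keyspace my_keyspace --steps 1 --dry-run > repair_plan.txt
# Strip the first (prefix) column so only the runnable commands remain
$ sed 's/^[^ ]* //' repair_plan.txt > run_repairs.sh
# Execute the generated commands; if one hangs, note it, Ctrl-C, and rerun it later
$ sh run_repairs.sh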

"Lost notification" from nodetool repair

I'm often seeing the following message when running nodetool repair:
[2015-02-10 16:19:40,042] Lost notification. You should check server log for repair status of keyspace xxx
What does it really mean (and how to prevent it if it's dangerous)?
I'm using Cassandra 2.1.2 in a four-node cluster.
This message is not harmful by itself. It only means that nodetool lost track of the repair status; it does not affect the repair itself. It can become dangerous if you issue the next repair command on apparent completion of the previous one, resulting in multiple concurrent repairs, which put a much higher load on the system. I used to have a script (I don't have it any more) that monitored the logs for the repair-cycle start/finish messages whenever a "lost notification" occurred, so as not to produce competing repairs.
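A rough sketch of that idea, not the original script; the log message matched here is an assumption and differs between Cassandra versions, so in practice you would only consider log lines newer than the repair start:
# Kick off the repair in the background
$ nodetool repair my_keyspace &
# Poll the system log for a completion message before issuing the next repair
$ until grep -q "Repair command .* finished" /var/log/cassandra/system.log; do sleep 60; done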
This seems to be a known bug which has already been fixed in the latest releases.
You can always, as the error message suggests, check Cassandra's system log and collect information about the repair activity.
$ cd /var/log/cassandra/
$ cat system.log | grep repair
Please note that I am testing Cassandra 2.1.15 for some purposes and still encountered the problem.
Since it is not a major bug and does not really affect the repair process, I think it will stick around for some time.
