I ran into a problem. I generated some data and could read all of it back; everything worked well. But after I shut down all CassandraDaemons and restarted them, I could no longer read all the data: the values for some columns were lost. I don't know why this happens. Could anyone give me some advice? Thanks very much. By the way, I use Cassandra 2.1 and the replication factor is 1.
It seems that Cassandra failed to replay the commitlog when restarting, which causes the data loss, but I don't know why. One workaround is to force a flush of the data into SSTables using nodetool before killing the CassandraDaemons.
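For example, before stopping each daemon (a sketch; both are standard nodetool commands):

nodetool flush    # write every memtable to disk as SSTables
nodetool drain    # or: flush and stop accepting writes, so nothing is left to replay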
Related
After I truncated and dropped a table in Cassandra, I still see the SSTables on disk, plus a lot of open file handles pointing to them.
What is the proper way to get rid of them?
Is there a way to do this without restarting the Cassandra nodes?
We're using Cassandra 3.7.
In Cassandra, data does not get removed immediately; it is marked with tombstones instead. You can run nodetool repair to get rid of the deleted data.
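For example (a sketch of the invocation only; mykeyspace and mytable are placeholder names):

nodetool repair mykeyspace mytable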
I started using Cassandra 3.7 and I keep having problems with the commitlog. When the PC shuts down unexpectedly, for example because of a power outage, the Cassandra service doesn't restart. I try to start it from the command line, but the error "could not read commit log descriptor in file" always appears.
I have to delete all the commit logs to start the Cassandra service, but then I lose a lot of data. I tried increasing the replication factor to 3, but it made no difference.
What can I do to decrease the amount of lost data?
PS: I only have one PC to run the Cassandra database on; it is not possible to add more machines.
I think your option here is to work around the issue, since there is unlikely to be a guaranteed way to prevent commitlog files from getting corrupted on a sudden power outage. Since you only have a single node, recovering the data is harder, and increasing the replication factor to 3 on a single-node cluster is not going to help.
One thing you can try is to flush the memtables more often (i.e. shorten the flush period): when a memtable is flushed, its entries in the commit log are discarded, so less data exists only in the commit log and less is lost on a crash. Details here. This will, however, not resolve the root issue.
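If you want to try that, the flush period can be set per table. A sketch, assuming Cassandra 3.x and placeholder names mykeyspace and mytable:

cqlsh -e "ALTER TABLE mykeyspace.mytable WITH memtable_flush_period_in_ms = 60000;"    # flush at least once a minute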
When restarting a Cassandra node, a lot of time is spent replaying the commitlog to achieve consistency. In our application it is more important to bring the node back up and running fast than to achieve consistency, so we have set "durable_writes = false" on all our manually created keyspaces to disable the commitlog. (We have not touched the system keyspaces.) Nevertheless, when we restart a node it still spends about one hour replaying the commitlog.
What is left in my commitlog?
Can I in any way investigate the content of the commitlog?
How can the commitlog be turned off (if not durable_writes = false)?
durable_writes is set per keyspace, so if any keyspace still has it enabled, there will still be mutations in the commitlogs to replay on startup. You may want to walk through the output of DESCRIBE SCHEMA.
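A quick way to check, assuming Cassandra 3.x (where the schema is exposed in system_schema; mykeyspace is a placeholder):

cqlsh -e "SELECT keyspace_name, durable_writes FROM system_schema.keyspaces;"    # which keyspaces still have durable writes on
cqlsh -e "ALTER KEYSPACE mykeyspace WITH durable_writes = false;"                # disable it for one of yours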
There are some tables (i.e. the system ones) that you want to keep durable, but they shouldn't hold enough data to impact startup. When starting up, Cassandra logs which keyspaces/tables it is reading, so you can check which ones it is replaying.
One hour is a very long time and has a certain smell to it; there may be something else going on here, and it probably warrants additional investigation. Some ideas: check the logs and make sure it really is the commitlog replay that is taking the time (not rebuilding index summaries or something similar). Also check that there are no old commitlog segments that Cassandra lacks permission to delete, or anything else that would stick around.
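If you want to verify this, something like the following might help (assuming the default log and data paths of a package install):

grep -i "replay" /var/log/cassandra/system.log    # confirm the startup time is going to commitlog replay
ls -lh /var/lib/cassandra/commitlog/              # look for old segments that never get removed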
Do 'nodetool drain' before shutting down the node. This flushes all memtables to SSTables, so there is nothing left in the commitlog to replay.
I have a single-node Cassandra installation on my development machine (and very little experience with Cassandra). I always had very little data in the node and experienced no problems. Today I inserted about 9,000 rows into a table to experiment with a real-world use case. Now the boot time when I start the node is extremely long, and I get this in system.log:
Replaying /var/lib/cassandra/commitlog/CommitLog-3-1388134836280.log
...
Log replay complete, 9274 replayed mutations
That took 13 minutes and is hardly bearable. I wonder if there is a way to store the data so that it can be read at once, without replaying the log. After all, 9,000 rows are nothing, and there must be a quicker way to boot. I googled for hints and searched Cassandra's documentation, but I didn't find anything. I'm obviously not looking for the right things; would anybody be so kind as to point me to the right documents? Thanks.
There are a few things that might help. The most obvious one is to flush everything to disk before you shut down Cassandra. This is a good idea in production too. Before I stop a Cassandra node in production, I run the following commands:
nodetool disablethrift
nodetool disablegossip
nodetool drain
The first two commands gracefully shut down connections to clients connected to this node and then to other nodes in the ring. The drain command flushes memtables to disk (sstables). This should minimize what needs to be replayed on startup.
There are other factors that can make startup take a long time. Cassandra opens all the SSTables on disk at startup. So the more column families and SSTables you have on disk the longer it will take before a node is able to start serving clients. There was some work done in the 1.2 release to speed this up (so if you are not on 1.2 yet you should consider upgrading). Reducing the number of SSTables would probably improve your start time.
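For example, to count how many SSTables a table has and merge them into fewer files (a sketch; mykeyspace and mytable are placeholder names, and the data path assumes a default install):

ls /var/lib/cassandra/data/mykeyspace/mytable/*-Data.db | wc -l    # number of SSTables on disk
nodetool compact mykeyspace mytable                                # major compaction: merge them

Note that a major compaction merges everything into one large SSTable, so treat this as a dev-box convenience rather than a production habit.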
Since you mentioned this was a development machine, I'll also share my dev-environment observations. On my development machine I do a lot of creating and dropping of column families and keyspaces. This can cause some of the system CFs to grow significantly and eventually cause a noticeable slowdown. The easiest way to handle this is to have a script that can quickly bootstrap a new database and blow away all the old data in /var/lib/cassandra.
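A minimal version of such a reset script might look like this (a sketch; the paths and service commands assume a default package install, so adjust for your layout):

nodetool drain                       # flush memtables so the shutdown is clean
sudo service cassandra stop
sudo rm -rf /var/lib/cassandra/data/* /var/lib/cassandra/commitlog/* /var/lib/cassandra/saved_caches/*
sudo service cassandra start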
The Cassandra service on one of my nodes went down, and we couldn't restart it because of corruption in one of the tables. So we tried rebuilding the node by deleting all the data files and then starting the service; once it showed up in the ring we ran nodetool repair multiple times, but it hung, throwing the same error:
Caused by: org.apache.cassandra.io.compress.CorruptBlockException: (/var/lib/cassandra/data/profile/AttributeKey/profile-AttributeKey-ib-1848-Data.db): corruption detected, chunk at 1177104 of length 11576.
This occurs after about 6 GB of data has been recovered. Also, my replication factor is 3, so the same data is fine on the other two nodes.
I am a little new to Cassandra and am not sure what I am missing; has anybody seen this issue with repair? I have also tried scrubbing, but it failed because of the corruption.
Please help.
Delete the corrupted SSTable's files with rm /var/lib/cassandra/data/profile/AttributeKey/profile-AttributeKey-ib-1848-* and restart; since your replication factor is 3, repair can restore that data from the other replicas.
Scrub should not fail, please open a ticket to fix that at https://issues.apache.org/jira/browse/CASSANDRA.
First use nodetool scrub. If that does not fix the problem,
shut down the node and run the offline sstablescrub [keyspace] [table]; it can remove the corrupted rows that the online nodetool scrub utility could not handle. Then run a repair and you should be able to resolve the issue.
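Putting that together for the keyspace and table named in the error above (a sketch; the service commands assume a package install):

nodetool scrub profile AttributeKey    # try the online scrub first
sudo service cassandra stop            # if it cannot fix the file, go offline
sstablescrub profile AttributeKey      # the offline scrub can drop rows the online one could not
sudo service cassandra start
nodetool repair profile                # then repair from the healthy replicas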