I have a single-node Cassandra setup for my application. To reclaim disk space occupied by deleted records (tombstoned records), I triggered a nodetool compact for my keyspace. Unfortunately, this compaction process got interrupted. Now, when I try to re-start the service, it does not recognise the keyspace (from the data directory configured in cassandra.yaml) for which compaction was in progress when it got interrupted. Other keyspaces like system and system_traces are successfully initiated from the same data directory.
Has anybody encountered a similar issue before? Also, pointers on restoring a keyspace from data files alone would be of great help (I have not been maintaining snapshots).
PS: Upon analysing further it was found that an rm command on the cassandra data directory was issued but immediately cancelled. Most of the data seems to be in place, but there is a chance that the Data.db file of the system keyspace was lost. Is there a way to recover from this state?
It seems you have corrupted your setup by deleting system keyspace files, so Cassandra is probably no longer picking up your keyspace at boot time.
Try this:
Download the same version of Cassandra again.
Create your keyspace and column family (cf) schemas.
Move whatever old data is left to the new data directory (Cassandra will only load the non-corrupted data):
sudo mv /data/cassandra_old/data/[keyspace]/[cf]-[md5-old]/* /data/cassandra_new/data/[keyspace]/[cf]-[md5-new]/
It should solve it if I understand the problem correctly.
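The move in the last step has to be repeated for every table, because the ID suffix of each table directory changes when the schema is recreated. Here is a minimal sketch of that loop in Python. The function name and the directory-name pattern are my own assumptions for illustration, so check them against what your Cassandra version actually writes to disk.

```python
import re
import shutil
from pathlib import Path

# Assumption: table directories are named "<table>-<32 hex chars>".
TABLE_DIR = re.compile(r"^(?P<table>.+)-(?P<id>[0-9a-f]{32})$")


def move_old_sstables(old_data: Path, new_data: Path, keyspace: str) -> dict:
    """Move files from each old table directory into the matching new one.

    Returns {table_name: number_of_files_moved}. Tables that have no
    directory in the new data directory are skipped.
    """
    new_dirs = {}
    for d in (new_data / keyspace).iterdir():
        m = TABLE_DIR.match(d.name)
        if d.is_dir() and m:
            new_dirs[m.group("table")] = d

    moved = {}
    for d in (old_data / keyspace).iterdir():
        m = TABLE_DIR.match(d.name)
        if not (d.is_dir() and m) or m.group("table") not in new_dirs:
            continue
        target = new_dirs[m.group("table")]
        count = 0
        for f in d.iterdir():
            if f.is_file():  # subdirectories such as snapshots/ are left alone
                shutil.move(str(f), str(target / f.name))
                count += 1
        moved[m.group("table")] = count
    return moved
```

Run it only while Cassandra is stopped, for example `move_old_sstables(Path("/data/cassandra_old/data"), Path("/data/cassandra_new/data"), "my_keyspace")`, where `my_keyspace` is a placeholder.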
Edited after reading nodetool tagged questions.
We take snapshots of our single node cassandra database daily. If I want to restore a snapshot either on that node, or on our staging server which is running a different instance of cassandra, my understanding is I have to:
nodetool disablegossip
nodetool disablebinary
nodetool drain
Copy the sstable files from the snapshot directories to the sstable directories under the keyspace directory.
Run nodetool refresh on each table.
Enable binary & gossip.
Is this sufficient to safely bring the snapshot sstable files in without cassandra overwriting them while I'm doing the refresh?
What is the opposite of nodetool drain?
Another edit: What about sstableloader? Should I use that instead? If so, how? I looked at the "documentation" and am none the wiser.
The steps you outlined aren't quite right. You don't need to shut down Cassandra and you shouldn't just copy the files on top of the existing SSTables.
At a high level, the steps to restore table snapshots on a node are:
TRUNCATE the table you want to restore (will remove the SSTables from the data directories).
Copy the SSTables from data/ks_name/table-UUID/snapshots/snapshot_name subdirectory into the "live" data directory data/ks_name/table-UUID.
Run nodetool refresh -- ks_name table_name.
You will need to repeat these steps for each application table you want to restore. NOTE: Do NOT restore system tables, only application tables.
The detailed steps are documented in Restoring from a snapshot in Cassandra.
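For one table, steps 2 and 3 can be sketched in Python roughly as follows. This is only an illustration: the function names are mine, and skipping `manifest.json` and `schema.cql` is an assumption about which snapshot files are not SSTable components. The TRUNCATE from step 1 is not included and has to be run first.

```python
import shutil
from pathlib import Path

# Assumption: these snapshot files are metadata, not SSTable components.
SKIP = {"manifest.json", "schema.cql"}


def copy_snapshot_to_live(table_dir: Path, snapshot_name: str) -> list:
    """Copy the files of one snapshot into the table's live data directory."""
    snapshot_dir = table_dir / "snapshots" / snapshot_name
    copied = []
    for f in sorted(snapshot_dir.iterdir()):
        if f.is_file() and f.name not in SKIP:
            shutil.copy2(f, table_dir / f.name)
            copied.append(f.name)
    return copied


def refresh_command(keyspace: str, table: str) -> list:
    """Build the nodetool call from step 3 as an argument list."""
    return ["nodetool", "refresh", "--", keyspace, table]
```

After the copy, something like `subprocess.run(refresh_command("ks_name", "table_name"), check=True)` would load the new SSTables on the running node.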
I prefer to call restoring a snapshot into another cluster "cloning". The procedure for cloning snapshots to another cluster depends on whether the source and destination clusters have identical configuration.
If both source and destination clusters are identical, follow the steps I documented here -- https://community.datastax.com/questions/4534/. I've explained what identical configuration means in this post.
If they are not identical, follow the steps I documented here -- https://community.datastax.com/questions/4477/. Cheers!
I have a Cassandra cluster running in a Kubernetes environment, in a namespace, say test1, and I wanted to test the restore function. So I took a snapshot in the test1 Cassandra, moved the snapshot to another node, and started a Cassandra cluster from that data in another namespace, say test2. The problem was that the test2 cluster completely replaced the test1 cluster: customer data that should have been written to the test1 cluster was written to the test2 cluster instead.
An hour later I noticed this, stopped the test2 cluster and restarted the test1 cluster. It came back to work shortly afterwards, but some data was lost.
After a while, I noticed that there were some commitlog files from that period on the test2 Cassandra node, and I want to recover this data. Can I just stop the test1 cluster, put these commitlog files into the commitlog directory of the test1 Cassandra node, then start Cassandra and let it replay these commitlogs?
Commitlogs from one node can’t be played on a different node or
cluster. They are transactions specific to the node they came from.
source - read the notes ("Important note: A point-in-time restore requires a cluster restart for the commitlog replay to run" and "Some Helpful Notes for Planning")
Later update:
I'm not sure what you mean by "test2 cassandra cluster replaced test1 cluster totally". My assumption is that you restored everything, including system keyspaces.
If you did this, then yes, applying the commit logs might work, since apart from the IPs and the hostnames the cluster is more or less the same.
If you look into the CommitLogReader code, you will see that a mutation is considered invalid if an UnknownTableException is thrown (basically if the id of the table is not the same between the commit log and the system keyspace).
I did a similar test on ccm and successfully replayed the commit logs after I changed the id of the table both on file system and in system_schema.tables.
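Before attempting anything like this, it helps to see which table IDs actually differ between the two nodes. Here is a rough Python sketch that compares them using the data directory names. It assumes each table directory is named `<table>-<id>` with the ID written as 32 hex characters, and the function names are made up for the example.

```python
import re
from pathlib import Path

# Assumption: table directories are named "<table>-<32 hex chars>".
TABLE_DIR = re.compile(r"^(?P<table>.+)-(?P<id>[0-9a-f]{32})$")


def table_ids(data_dir: Path, keyspace: str) -> dict:
    """Map table name -> ID suffix taken from the on-disk directory names."""
    ids = {}
    for d in (data_dir / keyspace).iterdir():
        m = TABLE_DIR.match(d.name)
        if d.is_dir() and m:
            ids[m.group("table")] = m.group("id")
    return ids


def mismatched_tables(source: dict, target: dict) -> list:
    """Tables present in both maps whose IDs differ."""
    return sorted(t for t in source if t in target and source[t] != target[t])
```

Any table that shows up in `mismatched_tables` is one whose mutations would presumably be rejected during replay. If a table was dropped and recreated there can be several directories for it, and this sketch keeps only the last one it sees.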
From my perspective your cluster is pretty messed up. Although you could do this and it might work, I think you will always have a high risk of corrupt data.
So, since the DataStax documentation (which we could consider the base documentation for Cassandra) states that this operation is not supported, I do not recommend it.
We take regular backups of our cluster and store the schema and snapshot backups on AWS S3 daily.
Somehow we lost all the data. While recovering from the backup we were able to restore the schema, but after copying the snapshot files to the /var/lib/cassandra/data directory the data does not show up in the tables.
After copying the data we ran nodetool refresh -- keyspace table, but still nothing shows up.
Could you please help with this?
I'm new to Apache Cassandra, but backup was the first topic I focused on.
If you want to restore from a snapshot (on a new node/cluster), you have to shut down Cassandra on every node and clear any existing data from these folders:
/var/lib/cassandra/data -> If you want to keep your system keyspaces, delete only your user keyspace folders
/var/lib/cassandra/commitlog
/var/lib/cassandra/hints
/var/lib/cassandra/saved_caches
After this, you have to start Cassandra again (the whole cluster). Recreate the keyspace and the tables you want to restore. In your snapshot folder you will find a schema.cql script for creating each table.
After creating the keyspaces and tables again, wait a moment (how long depends on the number of nodes in your cluster and the keyspaces you want to restore).
Shut down the Cassandra Cluster again.
Copy the Files from the Snapshot folder to the new folders of the tables you want to restore. Do this on ALL NODES!
After copying the files, start the nodes one by one.
If all nodes are running, run the nodetool repair command.
If you check the data via cqlsh, keep the CONSISTENCY LEVEL in mind (ALL/QUORUM)!
That is the approach that works very well on my Cassandra cluster.
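The clean-up at the start of this procedure is easy to get wrong by hand, so here is a small Python sketch of it. The list of system keyspace names is an assumption and differs between Cassandra versions, and `clear_for_restore` is just a name I picked. Run it only while Cassandra is stopped on that node.

```python
import shutil
from pathlib import Path

# Assumed set of system keyspaces; adjust it for your Cassandra version.
SYSTEM_KEYSPACES = {"system", "system_schema", "system_auth",
                    "system_distributed", "system_traces"}


def clear_for_restore(root: Path, keep_system: bool = True) -> list:
    """Empty the data, commitlog, hints and saved_caches folders under root.

    With keep_system=True the system keyspace folders are left in place.
    Returns the sorted names of the keyspace folders that were removed.
    """
    removed = []
    data = root / "data"
    if data.is_dir():
        for ks in data.iterdir():
            if keep_system and ks.name in SYSTEM_KEYSPACES:
                continue
            if ks.is_dir():
                shutil.rmtree(ks)
                removed.append(ks.name)
    for name in ("commitlog", "hints", "saved_caches"):
        folder = root / name
        if folder.is_dir():
            for entry in folder.iterdir():
                if entry.is_dir():
                    shutil.rmtree(entry)
                else:
                    entry.unlink()
    return sorted(removed)
```

With the default layout this would be called as `clear_for_restore(Path("/var/lib/cassandra"))` on each node.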
The general steps to follow for restoring a snapshot are:
1. Shut down Cassandra if it is still running.
2. Clear any existing data in the commitlog, data and saved_caches directories.
3. Copy the snapshots to the relevant data directories.
4. Copy incremental backups to the data directory (if incremental backups are enabled). If required, set the restore_point_in_time parameter in commitlog_archiving.properties to the restore point.
5. Start Cassandra.
6. Run repair.
So try running repair after copying data.
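If you do need the restore_point_in_time setting, the value has to be written in a specific format. As far as I can tell from the comments in commitlog_archiving.properties, it is a GMT timestamp in the form yyyy:MM:dd HH:mm:ss, so treat that format as an assumption and verify it against your own copy of the file. A tiny Python helper (the name is mine) to produce the line:

```python
from datetime import datetime, timezone


def restore_point_line(moment: datetime) -> str:
    """Format a timezone-aware datetime as a restore_point_in_time entry."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    utc = moment.astimezone(timezone.utc)  # the setting is assumed to be in GMT
    return "restore_point_in_time=" + utc.strftime("%Y:%m:%d %H:%M:%S")
```

For example, `restore_point_line(datetime(2020, 1, 2, 3, 4, 5, tzinfo=timezone.utc))` gives `restore_point_in_time=2020:01:02 03:04:05`.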
I have the following Cassandra question:
A few days ago I developed an application using C# and a single-node Cassandra database. While the application was in production, a power failure occurred and the Cassandra commit log got corrupted. Because of this the Cassandra node would not start, so I moved all the commit log files to another directory and started the node.
Recently I noticed that the data from the day of the power failure is not available in the database. I have all the commit log files, along with the name of the corrupted commit log file.
Can you please suggest whether there is a way to recover the data using the commit log files?
Also, how can I avoid commit log corruption so that data loss in production can be prevented?
Thank you.
There is no way to restore the node to its previous state if your commit logs are corrupted and you have no SSTables.
If your commit logs are healthy (that is, not corrupted), you just need to restart your node. The commit logs will be replayed, which rebuilds the memtable(s) and flushes generation-1 SSTables to disk.
Ideally, you should force the creation of SSTables.
You can do that from the apache-cassandra/bin directory with:
nodetool flush
So if you are wary of losing commit logs, you can rebuild your node to its previous state from the SSTables created above using
nodetool.bat refresh [keyspace] [columnfamily].
Alternatively you can also try creating snapshots.
nodetool snapshot
This command will take a snapshot of all keyspaces on the node. You also have the option of enabling incremental backups, but these only keep a record of the latest operations.
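If you want to run the flush and the snapshot together on a schedule, a thin Python wrapper around nodetool could look like the sketch below. The function names, the keyspace and the tag are placeholders of mine. `-t` is the nodetool snapshot option for naming the snapshot.

```python
import subprocess


def backup_commands(keyspace: str, tag: str) -> list:
    """Commands that flush memtables to SSTables and then snapshot a keyspace."""
    return [
        ["nodetool", "flush", keyspace],
        ["nodetool", "snapshot", "-t", tag, keyspace],
    ]


def run_backup(keyspace: str, tag: str, dry_run: bool = True) -> list:
    """Run the commands in order; with dry_run=True they are only returned."""
    commands = backup_commands(keyspace, tag)
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # raises if nodetool fails
    return commands
```

Calling `run_backup("my_keyspace", "daily", dry_run=False)` from a scheduled task on the node would then give you SSTables to fall back on even if a commit log gets corrupted later.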
For more info try reading
https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsNodetool_r.html
I suggest you can also try having more nodes and thus increase the replication factor to avoid such scenarios in future.
Hope it helps!
I just set up a fresh Windows server with a fresh DataStax installation including Cassandra 1.2 and OpsCenter 2.1.3. I've tried finding solutions to these questions on Cassandra wikis and the DataStax website, but I can only find Unix-specific information or DataStax API information.
Cassandra defaults to using the C: drive (I was never asked to select a drive for Cassandra during install).
In the same cassandra instance, can I have keyspaces on separate disks?
If not, how do I migrate the existing keyspace to the new drive? (just reconfiguring cassandra.yaml to use a new directory would lose my opscenter data and may even break opscenter).
If yes, how can I create a new keyspace on a separate drive? cassandra.yaml seems to only have configuration options for a single store location.
Should I be creating a new cluster to store my data in? If I start adding new nodes to the default cluster, that will mean the datastax opscenter data will be getting replicated - that seems like a bad idea.
If there is good documentation on this somewhere, please point me there.
Thanks,
Adam
You cannot get cassandra to split the keyspaces and store them in different directories. They are all stored under a common data directory that is specified in the cassandra.yaml file.
However, you can work around this by using NTFS to mount different drives under the data directory on your server, but this will be neither simple nor expandable.
If you want to move where the data is stored on cassandra, then stop the cassandra daemon/service, change the cassandra.yaml file to store the data at a new location, then copy/move the entirety of the data directory to this new location. THEN start cassandra back up and it will work fine with the data in the new location. I have done this quite a few times now and cassandra comes back up without incident and no lost data (if you do not move the data, then it will lose it all and recreate the directory structure under the new location).
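The copy step can be scripted so that you know nothing was left behind before you start Cassandra again. Here is a short Python sketch. The function names and the paths in the usage line are only examples, and it assumes the Cassandra service is already stopped and the new location does not exist yet.

```python
import shutil
from pathlib import Path


def relative_files(root: Path) -> set:
    """All file paths under root, relative to root."""
    return {p.relative_to(root) for p in root.rglob("*") if p.is_file()}


def copy_data_directory(old: Path, new: Path) -> int:
    """Copy old into new (new must not exist yet) and verify the file list.

    Returns the number of files found in the new location.
    """
    shutil.copytree(old, new)
    missing = relative_files(old) - relative_files(new)
    if missing:
        raise RuntimeError(f"{len(missing)} files were not copied")
    return len(relative_files(new))
```

For instance, `copy_data_directory(Path("C:/cassandra/data"), Path("D:/cassandra/data"))`, followed by pointing cassandra.yaml at the new location and starting the service. Keeping the old directory until the node has come back up cleanly gives you a way back.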
Data getting replicated is not a bad thing - it is what cassandra was designed for. I don't know what replication factor opscenter uses, but it does not store a massive amount of data so replication is not a problem.