Nodetool flush failed - Cassandra

I want to take a backup with a snapshot, but when I restore from that backup, I find I have lost some data.
I then ran a test: I created a table, inserted a first row, and ran a flush, and I could see files being generated in the data path. But when I inserted a second row and ran a flush again, nothing new was generated in the data path.
I expect that each time I insert data and flush, I should see new files generated in the data path. However, only the first flush produces files; every flush after that appears to fail.

My hunch is that you're running multiple nodes in your cluster, that your RF does not equal the total number of nodes, and that the particular record you're inserting therefore resides on a different node. Because of that, when you flush the node you're on, you don't see any new files generated. If you have multiple nodes in the cluster, you can run the "nodetool getendpoints" command (supplying the keyspace, table, and partition key value). It will tell you which nodes own that partition key (row). The other option is to set RF equal to the total node count, which ensures the data you insert resides on every node. Then you can run flush from any node and you should see a new file generated.
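For example, assuming a keyspace ks, a table tbl, and a row whose partition key value is 42 (all placeholder names), the check would look like this:
nodetool getendpoints ks tbl 42
That prints the address of every replica owning that key; running "nodetool flush ks tbl" on one of those nodes is what should produce new SSTable files in its data path.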
-Jim

Related

Cassandra deletes best practices

Looking to reclaim space on a large table. The table has old data which is no longer required and can be deleted. The deletes are based on partition key; there are about 500k partition keys to be deleted.
Would it be better to run the deletes in batches of, say, 50k or 100k in one go? What would be a good batch size (batch here meaning how many deletes are run in one go)?
If the deletes are run from cqlsh, will cqlsh act as a client and connect to different nodes as the coordinator for each delete, or will the node from which cqlsh is started act as the coordinator, with all the deletes fired from there?
What are the best practices for running massive deletes/cleanups? Any specific dos and don'ts?
The first thing you need to remember in Cassandra is that deletes actually increase disk consumption rather than decreasing it, until compaction happens and the deleted data is actually removed. The Last Pickle has a great blog post on that topic.
Regarding your questions:
Batches over different partition keys put heavy pressure on the coordinator node, so they aren't recommended, especially ones that big. Prefer to delete rows one by one.
cqlsh always sends commands to the same host (this is enforced by WhiteListPolicy), which acts as the coordinator and then forwards traffic to the nodes owning the data.
I would recommend using an external tool: either Spark with the Spark Cassandra Connector, or DSBulk, which can perform deletes as well via a custom query, something like the following (assuming that you have a CSV file with all values for the partition column(s) that you want to delete, where :pk is the name of the column in the CSV header and pk is the name of the partition column in your schema):
dsbulk load -query "DELETE FROM ks.table WHERE pk = :pk"
In this case DSBulk will correctly send the deletes directly to the nodes owning the data, avoiding the pressure on a single coordinator node.
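For illustration, such a CSV file (hypothetically named keys.csv, its header matching the :pk placeholder in the query above) is just the column name followed by one partition key value per line:
pk
1001
1002
1003
It would then be passed to DSBulk with its -url option, e.g. dsbulk load -query "DELETE FROM ks.table WHERE pk = :pk" -url keys.csv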

Is there a way to view data in 2 replicas in Cassandra?

I am a newbie to Cassandra. I have created a keyspace in Cassandra with NetworkTopologyStrategy and 2 replicas in one datacenter. Is there a CQL command or some other way to view my data on each of the two replicas?
Something like SELECT * FROM tablename on replica1 / replica2.
Is there some way I can visually see the data on the two replicas?
Thanks in advance.
Your question isn't entirely clear ("see the data in 2 replicas"), but if you ever want to validate your data, you can run some commands to visually inspect things.
The first thing you'd want to do is log onto the node you want to investigate. Go to the data directory of the table of interest: DataDir/keyspace/table. In there you'll see one or more files that look like *Data.db. Those are your SSTables. Data in memory is flushed to SSTables in certain scenarios; you want to be sure your data has been flushed from memory to disk before validating (as you may not find what you're looking for otherwise). To do that, issue a "nodetool flush" command (you can pass the keyspace and table as parameters if you only want to flush the specific table).
After that, everything in memory will have been flushed to disk, so you'll be able to see your SSTables (again, the *Data.db files). Once you have those SSTables, you can run the "sstabledump" command on each one to see the data that resides in it, thus validating your data.
If you have only a few rows to validate and a lot of nodes, you can find which nodes the rows reside on by running "nodetool getendpoints" with the keyspace, table, and partition key. That will tell you every node that has the data, so you're not guessing which node the row(s) should be on. Unfortunately, there is no way to know which SSTable the rows exist in (and it could be more than one if updates, deletes, etc. occurred), so you'll have to go through each SSTable on the specific node(s).
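Put together, a session on one node might look like the following (ks and tbl are placeholder names, and the exact table directory and SSTable file names will differ by Cassandra version and generation):
nodetool flush ks tbl
cd /var/lib/cassandra/data/ks/tbl-<table-id>/
sstabledump nb-1-big-Data.db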
Hope that helps answer your question.
Good luck.
-Jim
You can for a specific partition. If you are sure host1 is a replica (from nodetool getendpoints or from a query trace), and you make your query with CL.ONE explicitly against that host, the coordinator will always pick itself first. So:
// Java driver 4.x sketch: host1 is a Node obtained from
// session.getMetadata().getNodes(), and it must own key X
SimpleStatement q = SimpleStatement.newInstance("SELECT * FROM tablename WHERE key = X")
        .setNode(host1)
        .setConsistencyLevel(DefaultConsistencyLevel.ONE);
For SELECT * FROM tablename it's a bit harder, because you are scanning the entire data set and the coordinator will send out multiple queries, one for each part of the ring. If you run the queries with CL.ONE, each of those sub-queries will still only go to one node for its part of the range, so if you enable tracing on the statement (q.setTracing(true) in driver 4.x) you can see which node answered for each range. You have no control over which replica the coordinator picks, so it may take a few queries.
If you just want to see whether there are differences between replicas, you can use a preview repair: nodetool repair --preview --full.

Cassandra: If I TRUNCATE a table and restore a backup for only one node, will I lose data?

Suppose I have a [3 nodes - 1 datacenter - 1 cluster] Cassandra setup,
and a keyspace with replication factor = 2.
I am taking regular snapshots and incremental backups of all nodes.
One of my 3 nodes goes completely down for whatever reason, and I want to restore a backup.
The Cassandra (DataStax) documentation suggests usually running TRUNCATE on the table before restoring.
Question:
As I am only going to restore the backup on one node, is TRUNCATE necessary? Because, as per my understanding, TRUNCATE will delete that table's data from ALL nodes. TRUNCATE Doc
So if I truncate the table and restore the backup on only one node, wouldn't I lose the data for that table which was stored on the other nodes too?
First of all, in your scenario you might not want to restore a backup at all. Since you have replication factor = 2, your data is still on one other node of the original three. Therefore, you could remove the node that went completely down and add it back; Cassandra will automatically bring it up to speed and stream the data to it.
Alternatively, or in addition, you can stream the data files from the backup into your cluster with sstableloader.
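A minimal sketch of the sstableloader route, assuming the snapshot files for a keyspace ks and table tbl have been copied into a local directory ending in ks/tbl (the contact-point IPs and paths are placeholders):
sstableloader -d 10.0.0.1,10.0.0.2 /path/to/restore/ks/tbl
The -d option takes one or more live nodes in the cluster to contact, and the loader streams the SSTables to whichever nodes should own the data.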
A few other points though for the sake of knowledge:
TRUNCATE removes the data in the table on all nodes.
If you truncate the table and then restore on only one node, you will lose data with replication factor = 2 and three nodes.
In your case, TRUNCATE is not necessary.
Why TRUNCATE?
TRUNCATE is recommended in certain scenarios because the data you restore will have older timestamps than the newer data.
The example in the link you sent explains one of those scenarios rather well.
If you accidentally delete a lot of data and wish to restore your old data, you first need to remove the tombstones which mark those rows as deleted, by truncating.
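A small CQL illustration of why that is (the table and values are made up; the explicit timestamps just make the ordering visible):
INSERT INTO ks.tbl (pk, val) VALUES (1, 'old') USING TIMESTAMP 1000; -- original data
DELETE FROM ks.tbl USING TIMESTAMP 2000 WHERE pk = 1;                -- the accidental delete writes a tombstone at t=2000
INSERT INTO ks.tbl (pk, val) VALUES (1, 'old') USING TIMESTAMP 1000; -- restored data still carries the older timestamp
SELECT * FROM ks.tbl WHERE pk = 1;                                   -- returns nothing: the newer tombstone wins
Truncating first removes the tombstone, so the restored rows become visible again.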

Cassandra data directory does not get updated with deletion

Currently, I am benchmarking a Cassandra database using the YCSB framework. During this time I have performed (batch) insertions and deletions of data quite regularly.
I am using the TRUNCATE command to delete keyspace rows. However, I am noticing that my Cassandra data directory swells up as the experiments proceed.
I have checked and can confirm that there is no data left in the keyspace, yet the data directory keeps its size. Is there a way to initiate a process so that Cassandra automatically releases the stored space, or does it happen over time?
When you use TRUNCATE, Cassandra will create snapshots of your data.
To disable that, you have to set auto_snapshot: false in the cassandra.yaml file.
If you are using DELETE, then Cassandra uses tombstones, i.e. your data will not get deleted immediately. The data will get deleted once compaction has run.
To remove existing snapshots you can use the "nodetool clearsnapshot" command.
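For example (the snapshot tag and keyspace name are placeholders):
nodetool listsnapshots                       # lists existing snapshots and the space they occupy
nodetool clearsnapshot -t mysnapshot -- ks   # removes the snapshot tagged "mysnapshot" for keyspace ks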

Cassandra - avoid nodetool cleanup

If we have added new nodes to a C* ring, do we need to run "nodetool cleanup" to get rid of the data that has now been assigned elsewhere? Or is this going to happen anyway during normal compactions?
During normal compactions, does C* remove data that no longer belongs on a node, or do we need to run "nodetool cleanup" for that? Asking because "cleanup" takes forever and crashes the node before finishing.
If we need to run "nodetool cleanup", is there a way to find out which nodes now have data they should no longer own? (i.e. data that now belongs on the new nodes but is still present on the old ones because no one removed it; this is the data that "nodetool cleanup" would remove.) We have RF=3 and two data centers, each of which has a complete copy of the data. I assume we need to run cleanup on all nodes in the data center where we added nodes, because each row on a new node used to be on another node (the primary), plus two copies (replicas) on two other nodes.
If you are on Apache Cassandra 1.2 or newer, cleanup checks the metadata on the files, so it only does something if it needs to. You are therefore safe to just run it on every node, and only those nodes with extra data will do anything. The data will not be removed during the normal compaction process; you have to call cleanup to remove it.
What I found helpful is to compare how much space each node occupies in the data folder (for me it was /var/lib/cassandra/data). Some things like snapshots might differ between the nodes, but when you see that older nodes use much more disk space than newer ones, it might be because they never had a cleanup after the newer ones were added. While you are there, you can also check what the biggest .db file is and whether your storage has enough free space for another file of that size: cleanup seems to copy the data of the .db files into new ones, minus the data that is now on other nodes, so you may need that extra space while it runs.
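A quick way to do that comparison, run on each node in turn (the keyspace name ks is a placeholder, assuming the default data layout mentioned above):
du -sh /var/lib/cassandra/data                         # total on-disk size on this node
ls -lhS /var/lib/cassandra/data/ks/*/*Data.db | head   # largest SSTables first
nodetool cleanup ks                                    # then reclaim the extra data, keyspace by keyspace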
