I'm currently running a 12-node Cassandra cluster storing 4TB of data, with a replication factor set to 3. For the needs of an application update, we need to change the configuration of our keyspace, and we'd like to avoid any downtime if possible.
I read on a mailing list that the best way to do it is to:
1. Kill the Cassandra process on one server of the cluster.
2. Start it again, wait for the commit log to be written to disk, and kill it again.
3. Make the modifications in the storage.xml file.
4. Rename or delete files in the data directories according to the changes we made.
5. Start Cassandra.
6. Go to step 1 with the next server on the list.
My questions would be:
Did I understand the process well?
Is there any risk of data corruption?
During the process, there will be servers with different versions of the storage.xml file in the same cluster and the same keyspace. Is that a problem?
Same question as above if, instead of only adding, renaming, and removing ColumnFamilies, we also change the CompareWith parameter or transform an existing column family into a super one. Or do we need to change the name in that case?
Thank you for your answers. It's the first time I'll do this, and I'm a little bit scared.
Your list looks like the one in http://wiki.apache.org/cassandra/FAQ#modify_cf_config. So it should be accurate...
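For illustration only, a rough per-node pass over that list might look like the following on an old storage.xml-based install; the paths, the PID placeholder, and the optional nodetool drain are assumptions, not part of the FAQ procedure:

# steps 1-2: stop the node with its commit log flushed (drain is an optional shortcut for the start/wait/kill dance)
nodetool -h localhost drain
kill <cassandra_pid>
# step 3: edit the keyspace/ColumnFamily definitions
vi /path/to/conf/storage.xml
# step 4: rename or delete the matching files in the data directory, e.g. for a renamed ColumnFamily
#         (repeat for the matching -Index.db and -Filter.db files):
mv /var/lib/cassandra/data/MyKeyspace/OldCF-1-Data.db /var/lib/cassandra/data/MyKeyspace/NewCF-1-Data.db
# steps 5-6: start Cassandra again, then move on to the next server
/path/to/bin/cassandra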
I searched the internet a lot and saw many ways to back up and restore a Cassandra cluster, such as nodetool snapshot and Medusa. But my question is: can I use DSBulk to back up a Cassandra cluster? What are its limitations? Why doesn't anyone suggest that?
It's possible to use it in some cases, but it's not practical, mainly because (the list could be longer):
DSBulk puts additional load onto the cluster nodes because it goes through the standard read path. In contrast, nodetool snapshot just creates hard links to the data files, so there is no additional load on the nodes.
It's harder to implement incremental backups with DSBulk - you need to come up with a SELECT condition that finds only the data that changed since the last backup, so you need a timestamp column, because you can't put a WHERE condition on the value of the writetime function. It will also require rescanning the whole data set anyway, and it's impossible to find out which data were deleted. With nodetool snapshot, you just compare which files have changed since the last backup and back up only those. (A rough sketch of what a DSBulk export would look like follows this list.)
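For illustration only, a DSBulk-based export might look like this (the keyspace, table, column, and paths are placeholders, not something from the question):

dsbulk unload -k my_keyspace -t my_table -url /backups/my_table/full
# an "incremental" attempt needs an explicit timestamp column and still rescans the table:
dsbulk unload -query "SELECT * FROM my_keyspace.my_table WHERE updated_at > '2024-01-01' ALLOW FILTERING" -url /backups/my_table/incremental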
I am a newbie to Cassandra. I have created a keyspace in Cassandra with NetworkTopologyStrategy and 2 replicas in one datacenter. Is there a CQL command or some other way to view my data in the two replicas?
Like SELECT * FROM tablename in replica1 / replica2
Is there another way such that I can visually see the data in the two replicas?
Thanks in advance.
Your question isn't entirely clear about what "see the data in 2 replicas" means. If you ever want to validate your data, you can run some commands to visually see things.
The first thing you'd want to do is log onto the node you want to investigate. Go to the data directory of the table you're interested in: DataDir/keyspace/table. In there you'll see one or more files that look like *Data.db; those are your sstables. Data in memory is flushed to sstables only in certain scenarios, so if you're validating, you want to be sure your data has been flushed from memory to disk (as you may not find what you're looking for otherwise). To do that, issue a "nodetool flush" command (you can pass the keyspace and table as parameters if you only want to flush that specific table).
Like I said, after that, everything in memory would be flushed to disk. So you'd be able to see your sstables (again, *Data.db) files. Once you have those sstables, you can run the "sstabledump" command on each sstable to see the data that resides in them, thus validating your data.
If you have only a few rows to validate and a lot of nodes, you can find which node(s) the rows reside on by running "nodetool getendpoints" with the keyspace, table, and partition key. That will tell you every node that holds the data, so you're not guessing which node the row(s) should be on. Unfortunately, there is no way to know which sstable the rows exist in (and it could be more than one if updates, deletes, etc. occurred), so you'll have to go through each sstable on the specific node(s).
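For example, the flow described above might look like this (the keyspace, table, partition key, and paths are placeholders, not from the question):

nodetool flush my_keyspace my_table
nodetool getendpoints my_keyspace my_table 'my_partition_key'
# on each node returned above, run sstabledump on each *-Data.db file in the table's directory, e.g.:
sstabledump /var/lib/cassandra/data/my_keyspace/my_table-<table_id>/nb-1-big-Data.db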
Hope that helps answer your question?
Good luck.
-Jim
You can for a specific partition. If you are sure host1 is a replica (from nodetool getendpoints or from a query trace), then if you make your query with CL.ONE explicitly to that host, the coordinator will always pick local first. So:
Statement q = new SimpleStatement("SELECT * FROM tablename WHERE key = X");
q.setHost("host1");
Where host1 owns X.
For SELECT * FROM tablename it's a bit harder, because you are reading over the entire data set and the coordinator will send out multiple queries, one for each part of the ring. If you run the queries with CL.ONE it will still only go to one node for each part of that range, so if you set q.enableTracing() you can see which node answered for each range. You have no control over which replica the coordinator picks, so it may take a few queries.
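As a cqlsh-based sketch of the same idea (my_keyspace and the key value are placeholders; cqlsh uses the node you connect to as the coordinator):

cqlsh host1
CONSISTENCY ONE;
TRACING ON;
SELECT * FROM my_keyspace.tablename WHERE key = X;
-- the trace printed after the rows shows which replica actually served the read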
If you just want to see whether there are differences between replicas, you can use a preview repair: nodetool repair --preview --full.
I have two different independent machines running Cassandra and I want to migrate the data from one machine to the other.
Thus, I first took a snapshot of my Cassandra Cluster on machine 1 according to the datastax documentation.
Then I moved the data to machine 2, where I'm trying to import it with sstableloader.
As a note: the keyspace (open_weather) and table name (raw_weather_data) on machine 2 have been created and are the same as on machine 1.
The command I'm using looks as follows:
bin/sstableloader -d localhost "path_to_snapshot"/open_weather/raw_weather_data
And then get the following error:
Established connection to initial hosts
Opening sstables and calculating sections to stream
For input string: "CompressionInfo.db"
java.lang.NumberFormatException: For input string: "CompressionInfo.db"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.parseInt(Integer.java:615)
at org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:276)
at org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:235)
at org.apache.cassandra.io.sstable.Component.fromFilename(Component.java:120)
at org.apache.cassandra.io.sstable.SSTable.tryComponentFromFilename(SSTable.java:160)
at org.apache.cassandra.io.sstable.SSTableLoader$1.accept(SSTableLoader.java:84)
at java.io.File.list(File.java:1161)
at org.apache.cassandra.io.sstable.SSTableLoader.openSSTables(SSTableLoader.java:78)
at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:162)
at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:106)
Unfortunately, I have no idea why.
I'm not sure if it is related to the issue, but somehow on machine 1 my *.db files are named rather "strangely" compared to the *.db files I already have on machine 2.
*.db files from machine 1:
la-53-big-CompressionInfo.db
la-53-big-Data.db
...
la-54-big-CompressionInfo.db
...
*.db files from machine 2:
open_weather-raw_weather_data-ka-5-CompressionInfo.db
open_weather-raw_weather_data-ka-5-Data.db
What am I missing? Any help would be highly appreciated. I'm also open to any other suggestions. The COPY command will most probably not work, since it is limited to 99999999 rows as far as I know.
P.S. I didn't want to create an overly huge post, but if you need any further information to help me out, just let me know.
EDIT:
Note that I'm using Cassandra in the stand-alone mode.
EDIT2:
After installing the same version 2.1.4 on my destination machine (machine 2), I still get the same errors: with sstableloader I still get the above-mentioned error, and when copying the files manually (as described by LHWizard), I still get empty tables after starting Cassandra again and performing a SELECT.
Regarding the initial tokens, I get a huge list of tokens if I run nodetool ring on machine 1. I'm not sure what to do with those.
Your data is already in the form of a snapshot (or backup). What I have done in the past is the following:
1. Install the same version of Cassandra on the restore node.
2. Edit cassandra.yaml on the restore node - make sure that cluster_name and snitch are the same.
3. Edit the seeds: list and any other properties that were altered on the original node.
4. Get the schema from the original node using cqlsh DESC KEYSPACE.
5. Start Cassandra on the restore node and import the schema.
6. Stop Cassandra and delete the contents of the /var/lib/cassandra/data/, commitlog/, and saved_caches/ folders. (Steps 6 and 7 may not be completely necessary, but this is what I do.)
7. Restart Cassandra on the restore node to recreate the correct folders, then stop it.
8. Copy the contents of the snapshots folder into each corresponding table folder on the restore node, then start Cassandra (see the example commands after this list). You probably want to run nodetool repair afterwards.
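For example, step 8 might look roughly like this on the restore node (the snapshot path, table directory suffix, keyspace/table names, and service command here are assumptions, not from the question):

cp -r /path/to/copied/snapshot/* /var/lib/cassandra/data/my_keyspace/my_table-<table_id>/
chown -R cassandra:cassandra /var/lib/cassandra/data/my_keyspace   # if Cassandra runs as the cassandra user
service cassandra start
nodetool repair my_keyspace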
You don't really need to bulk import the data; it's already in the correct format if you are using the same version of Cassandra, although you didn't specify that in your original question.
If we have added new nodes to a C* ring, do we need to run "nodetool cleanup" to get rid of the data that has now been assigned elsewhere? Or is this going to happen anyway during normal compactions?
During normal compactions, does C* remove data that no longer belongs on this node, or do we need to run "nodetool cleanup" for that? Asking because "cleanup" takes forever and crashes the node before finishing.
If we need to run "nodetool cleanup", is there a way to find out which nodes now have data they should no longer own? (i.e. data that now belongs on the new nodes but is still present on the old nodes because no one removed it - the data that "nodetool cleanup" would remove.) We have RF=3 and two data centers, each of which has a complete copy of the data. I assume we need to run cleanup on all nodes in the data center where we added nodes, because each row on a new node used to be on another node (primary), plus two copies (replicas) on two other nodes.
If you are on Apache Cassandra 1.2 or newer, cleanup checks the metadata on files so that it only does something if it needs to. So you are safe to just run it on every node, and only the nodes with extra data will do anything. The data will not be removed during the normal compaction process; you have to call cleanup to remove it.
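For example, a minimal way to run it, one keyspace at a time (the keyspace name and the throughput value are placeholders):

nodetool cleanup my_keyspace
# if cleanup overwhelms the node, try throttling compaction throughput (MB/s) first,
# since cleanup runs through the compaction machinery:
nodetool setcompactionthroughput 16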
What I found helpful is to compare how much space each node occupies in the data folder (for me it was /var/lib/cassandra/data). Some things like snapshots might differ between nodes, but when you see that older nodes use much more disk space than newer ones, it might be because they never had a cleanup after the newer ones were added. While you are there, you can also check what the biggest .db file is and verify that your storage has enough free space for another file of that size: cleanup seems to copy the data of the .db files into new ones, minus the data that is now on other nodes, so you may need that extra space while it runs.
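A rough sketch of that check, assuming the default data path (adjust for your setup):

du -sh /var/lib/cassandra/data        # total data size on this node; compare across nodes
find /var/lib/cassandra/data -name '*-Data.db' -printf '%s %p\n' | sort -n | tail -1    # largest sstable
df -h /var/lib/cassandra/data         # free space available for the temporary extra copy during cleanup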
Is there an easy way to move from having a single node to two nodes that both have a full copy of the data? In the past when I added a node, I had to wait for the second node to take half of the data, and then I had to muck around to make them full replicas again.
Yes, you can copy the sstables over directly, and that way when the node starts up it can access its data directly without needing streaming. This is how it should work: go to your data folder on the old node and copy over the directories for the keyspaces you want to migrate, as well as the schema directories inside system. Copy these over to your new node's data directory (create it if necessary). Then configure the new node and have it join the cluster as usual. I believe you'll have access to your data at this point without further migrations.
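A rough sketch, assuming default data paths, a keyspace named my_keyspace, and a new node reachable as newnode (all placeholders); the ALTER KEYSPACE note is an extra addition, only needed if the keyspace's replication factor isn't already 2:

# on the old node, flush memtables so everything is on disk as sstables:
nodetool flush my_keyspace
# copy the keyspace's data directory and the schema directories under system/ to the new node:
rsync -a /var/lib/cassandra/data/my_keyspace newnode:/var/lib/cassandra/data/
rsync -a /var/lib/cassandra/data/system/schema_* newnode:/var/lib/cassandra/data/system/
# once the new node has joined, make sure the keyspace replicates to both nodes, then repair:
#   ALTER KEYSPACE my_keyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};
nodetool repair my_keyspace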
See a blog post on this topic for more details, especially on using a more refined approach with the help of sstableloader tool.