How to migrate data from a Cassandra cluster of size N to a different cluster of size N+/-M

I'm trying to figure out how to migrate data from one Cassandra cluster to another Cassandra cluster of a different ring size, say from a 5-node cluster to a 7-node cluster.
I started looking at sstable2json, since it creates a JSON file for the SSTables on a specific Cassandra node. My thought was to do this for a column family on each node in the ring. So on a 5-node ring, this would give me 5 JSON files, one for the portion of the column family's data that resides on each node.
Then I'd merge the JSON files into one file and use json2sstable to import it into a new cluster of size, let's say, 7. I was hoping that Cassandra would then replicate/balance the data evenly across the nodes in the ring, but I just read that SSTables are immutable once written. So if I did what I just described, I'd end up with a ring where all the data in my column family sits on one node.
So can anyone help me figure out the process for migrating data from one cluster to a different cluster of a different ring size?

Better: use bin/sstableloader on the sstables from the old ring to stream them to the new one.
Normally sstableloader is used in a sequence like this:
Create sstables locally using SSTableWriter
Use sstableloader to stream the data in the sstables to the right nodes (bin/sstableloader path-to-directory-full-of-sstables). The directory name is assumed to be the keyspace, which will be the case if you point it at an existing Cassandra data directory.
Since you're looking to stream data from an existing cluster A to a new cluster B, you can skip straight to running sstableloader against the data on each node in cluster A.
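For example, on each node of cluster A you might run something like the following (the data path, keyspace/table names and the target host are placeholders, and the exact directory layout and the -d flag depend on your Cassandra version):

    # Make sure everything in memory has been written out as sstables,
    # then stream this table's sstables to a node in the new cluster.
    nodetool flush my_keyspace
    bin/sstableloader -d new-cluster-node1 /var/lib/cassandra/data/my_keyspace/my_table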
More details on using sstableloader in this blog post.

You don't need to use sstable2json. If you have the space you can:
get all the sstables from all of the nodes on the old ring
put them all together on each of the new servers (renaming any which have the same names)
run nodetool cleanup on each node in the new ring and they will throw away the data that doesn't belong to them.
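A rough sketch of that approach, with placeholder hostnames and paths (the standard data layout is data/<keyspace>/<table>):

    # On each new node, with Cassandra stopped, pull in the sstables from every
    # old node, renaming any files whose generation numbers collide.
    rsync -av old-node1:/var/lib/cassandra/data/my_keyspace/ /var/lib/cassandra/data/my_keyspace/
    # ...repeat for old-node2 through old-node5, renaming clashing files...
    # Then start Cassandra and throw away the rows this node does not own:
    nodetool cleanup my_keyspace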

You could also do it with the following steps (a command sketch follows the list):
1. Join the 7 new nodes to the 5-node cluster, giving each node its own ring token. At this point you have a 12-node cluster.
2. Decommission the 5 original nodes from the combined cluster.
3. Adjust the tokens of the remaining 7 nodes to rebalance the ring after the 5 nodes have been removed.
4. Run repair on the 7-node cluster.
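In command terms, steps 2-4 would look roughly like this (the token value is a placeholder; calculate one per remaining node):

    # Step 2: on each of the 5 original nodes, one at a time:
    nodetool decommission
    # Step 3: on each of the 7 remaining nodes, move it to its recalculated token:
    nodetool move <new-token>
    # Step 4: on each of the 7 nodes:
    nodetool repair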

I would venture to say that this isn't as big of a problem as it may seem.
Create your new ring and define the tokens for each node appropriately as per http://wiki.apache.org/cassandra/Operations#Token_selection
Import data into the new ring.
The ring will balance itself based on the tokens you have defined: http://wiki.apache.org/cassandra/Operations#Import_.2BAC8_export
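For instance, for a 7-node ring on the RandomPartitioner, evenly spaced tokens can be calculated with a small sketch like this (adjust NODES to your ring size):

    # Print an initial_token for each node: token_i = i * (2^127 / NODES).
    NODES=7
    for i in $(seq 0 $((NODES - 1))); do
      echo "node $i: $(echo "$i * (2^127 / $NODES)" | bc)"
    done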

Related

Data is replicated/copied on my 2nd node even with a replication factor of 1 for the key-space

I have a Cassandra cluster of 3 nodes, and I created a keyspace 'abcd' using SimpleStrategy and a replication factor of 1. Since I chose RF 1, I assume that any writes to my node-1 should not be replicated to the other 2 nodes.
But when I inserted a record into the keyspace/table, I saw the new row turning up on all nodes in my cluster.
My question: since I have chosen RF 1 for this keyspace, I would have expected only one node (i.e. node-1) in this cluster to own this data, not the rest of the nodes.
Please correct me if my understanding is wrong.
Since your RF is 1, your data is written to only one node. But you can access that data by running the select query from the other nodes as well, because any node in a Cassandra cluster can act as the coordinator and access all the data present in the cluster.
If the node from which you are running the query does not have the data, it will fetch the data from other nodes and display the result.
You can check which exact node has the data by running nodetool getendpoints.
You will need to pass the keyspace, table name and partition key.
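For example, with the keyspace from the question and placeholder table/key names:

    # Shows which node(s) own the partition with key 'user123'
    # in table 'users' of keyspace 'abcd'.
    nodetool getendpoints abcd users user123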

Maintaining RF when a node fails

Does Cassandra maintain the RF when a node goes down? For example, if the number of nodes is 5 and the RF is 2, then when a single node goes down, does the remaining replica copy its data to some other node to maintain the RF of 2?
In DataStax's documentation, it's mentioned that "If a node fails, the load is spread evenly across other nodes in the cluster". Does this mean that migration of data happens when a node goes down? Is this a feature available only in DataStax's Cassandra and not Apache Cassandra?
No. Instead, a "hint" will be stored on the coordinator node and will eventually be written to the node that owns the token range once that node comes back up; whether the write succeeds in the meantime depends on your consistency level. So in the above example the write will succeed if you are writing with consistency level ONE.
If the node is down only for a short period, it will receive the data back via hints from the other nodes when it comes back. But if you decommission a node, the data gets replicated to other nodes, and those nodes take over its token ranges (the same applies when a node is added to the cluster).
Over time the data in one replica can become inconsistent with the others, and the repair process helps Cassandra fix it: https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsRepairNodesTOC.html
This is applicable in Apache Cassandra as well.
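To check and repair this on a running cluster, a minimal sketch (the keyspace name is a placeholder):

    # Confirm hinted handoff is enabled on the node:
    nodetool statushandoff
    # After a node has been down longer than the hint window, repair its data:
    nodetool repair -pr my_keyspace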

Cassandra: For a single node cluster, will keyspace replication factor >1 increase disk space usage?

I have a keyspace with the replication factor set to 3, but I have only a single node. Will the disk space used then be 3 times the data size? Since the replicas are not yet assigned to any other nodes, will Cassandra skip creating replicas until new nodes join the cluster?
No, the disk space used would not be three times the size. The single node would own the entire token range and all writes would be written to that single node once.
What happens with the writes for the other two replicas depends on whether those nodes were previously present in the cluster and are currently down, or whether they have never been added to the cluster. If they had never been added, then C* would just skip trying to write to them.
If they had been added but are currently down, and if you have hinted handoffs enabled and are still within the hinted handoff window, then C* will store hints for the down nodes on the single up node.
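To see this on a single-node test cluster, a small sketch ('demo' is a placeholder keyspace name):

    # Create the kind of keyspace discussed above, RF 3 on a one-node cluster:
    cqlsh -e "CREATE KEYSPACE demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};"
    # nodetool status shows the single node owning 100% of the effective data:
    nodetool status demo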
It also depends on the replication strategy you have used. Assuming your queries are working, you have probably used SimpleStrategy. If you write to such a configuration at a consistency level that requires more than one replica, the write will fail, since it needs to write to two additional replica nodes, which in the case of SimpleStrategy are the next two clockwise nodes in the ring, before acknowledging the client.

How to migrate single-token cluster to a new vnodes cluster without downtime?

We have a Cassandra cluster with a single token per node, 22 nodes in total, and an average load of 500 GB per node. It uses SimpleStrategy for the main keyspace and SimpleSnitch.
We need to migrate all data to a new datacenter and shut down the old one without downtime. The new cluster has 28 nodes, and I want to use vnodes on it.
I'm thinking of the following process:
Migrate the old cluster to vnodes
Setup the new cluster with vnodes
Add nodes from the new cluster to the old one and wait until it balances everything
Switch clients to the new cluster
Decommission nodes from the old cluster one by one
But there are a lot of technical details. First of all, should I shuffle the old cluster after the vnodes migration? Then, what is the best way to switch to NetworkTopologyStrategy and to GossipingPropertyFileSnitch? I want to switch to NetworkTopologyStrategy because the new cluster has 2 different racks with separate power/network switches.
should I shuffle the old cluster after vnodes migration?
You don't need to. If you go from one token per node to 256 (the default), each node will split its range into 256 adjacent, equally sized ranges. This doesn't affect where data lives. But it means that when you bootstrap a new node in the new DC it will remain balanced throughout the process.
what is the best way to switch to NetworkTopologyStrategy and to GossipingPropertyFileSnitch?
The difficulty is that switching the replication strategy is in general not safe, since data would need to be moved around the cluster. NetworkTopologyStrategy (NTS) will place data on different nodes if you tell it nodes are in different racks. For this reason, you should move to NTS before adding the new nodes.
Here is a method to do this, after you have upgraded the old cluster to vnodes (your step 1 above):
1a. List all existing nodes as being in DC0 in the properties file. List the new nodes as being in DC1 and their correct racks.
1b. Change the replication strategy to NTS with options DC0:3 (or whatever your current replication factor is) and DC1:0.
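Step 1b in CQL might look like this ('my_keyspace' is a placeholder; keep the DC0 count equal to your current replication factor):

    # Switch to NetworkTopologyStrategy, keeping all replicas in DC0 for now.
    # Omitting DC1 is equivalent to giving it zero replicas.
    cqlsh -e "ALTER KEYSPACE my_keyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'DC0': 3};"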
Then to add the new nodes, follow the process here: http://www.datastax.com/docs/1.2/operations/add_replace_nodes#adding-a-data-center-to-a-cluster. Remember to set the number of tokens to 256 since it will be 1 by default.
In step 5, you should set the replication factor for DC0 to 0, i.e. change the replication options to DC0:0, DC1:3. Now those nodes aren't being used, so decommission won't stream any data, but you should still decommission them rather than powering them off so that they are removed from the ring.
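A sketch of that final switch, again with a placeholder keyspace name:

    # Move all replicas to the new datacenter, then retire the old nodes one at a time.
    cqlsh -e "ALTER KEYSPACE my_keyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3};"
    nodetool decommission    # run on each DC0 node in turn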
Note one risk is that writes made at a low consistency level to the old nodes could get lost. To guard against this, you could write at CL.LOCAL_QUORUM after you switch to the new DC. There is still a small window where writes could get lost (between steps 3 and 4). If it is important, you can run repair before decommissioning the old nodes to guarantee no losses or write at a high consistency level.
If you are trying to migrate to a new cluster with vnodes, wouldn't you need to change the partitioner? The documentation says that it isn't a good idea to migrate data between different partitioners.

How to merge two Cassandra clusters into one?

I have two Cassandra clusters: cluster 1 has one keyspace called 'Keyspace1', and cluster 2 has one keyspace called 'Keyspace2'. I want to merge the two clusters into one.
How can I do that?
If the cluster that is going to hold the data has sufficient space, then one option would be to stream the sstables' data and index files to that cluster using sstableloader.
For more information see Cassandra Bulk Loader
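A hedged sketch of that (the host, path and table name are placeholders, and the target cluster must already have the Keyspace2 schema defined):

    # On a node of cluster 2: flush, then stream Keyspace2's sstables into cluster 1.
    nodetool flush Keyspace2
    sstableloader -d cluster1-node1 /var/lib/cassandra/data/Keyspace2/Table1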
