How to reuse a Cassandra node after decommission?

I had a cluster with 2 nodes (node 1 and node 2).
After decommissioning node 2 I wanted to use the server as a fresh Cassandra database for other purposes, but as soon as I restart this message appears:
org.apache.cassandra.exceptions.ConfigurationException: This node was
decommissioned and will not rejoin the ring unless
cassandra.override_decommission=true has been set, or all existing
data is removed and the node is bootstrapped again
So I removed all existing data.
But I don't want the node to be bootstrapped again or to rejoin the previous ring. I want it to be a fresh, standalone Cassandra database.
The old node is not on the seed list.
Cassandra version: 3.9
EDIT: I think I was misunderstood, sorry for that. After the decommission I want to have:
Db1: node 1
Db2: node 2
Two different databases with no correlation, totally separated. That's because we want to reuse the machine where node 2 is hosted to deploy a Cassandra DB in another environment.

Don't use override_decommission. That flag is only used for rejoining the same cluster.
You should remove all data files on the node (Cassandra will recreate the system tables on start). Most importantly, you need to change the seed list in cassandra.yaml. I suspect it still contains the IP of node 1, so change it to the IP of node 2 (the node itself).
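The configuration change described above can be sketched in Python. This is a minimal sketch, not a definitive tool: it assumes the stock cassandra.yaml layout (a top-level cluster_name line and a "- seeds:" entry under seed_provider), and the function name make_standalone, the cluster name "Db2" and the 10.0.0.x addresses are hypothetical. It also rewrites cluster_name, which another answer in this thread recommends.

```python
import re


def make_standalone(yaml_text, cluster_name, own_ip):
    """Return cassandra.yaml text with cluster_name and seeds rewritten.

    Assumes the stock layout: a top-level `cluster_name:` line and a
    `- seeds: "..."` line under seed_provider.
    """
    # Replace the whole cluster_name line. A function replacement avoids
    # any special handling of backslashes in the new name.
    text = re.sub(
        r"(?m)^cluster_name:.*$",
        lambda m: f"cluster_name: '{cluster_name}'",
        yaml_text,
    )
    # Keep the original indentation of the seeds entry, swap only the value.
    text = re.sub(
        r"(?m)^([ \t]*- seeds:).*$",
        lambda m: f'{m.group(1)} "{own_ip}"',
        text,
    )
    return text


# Hypothetical input: node 2 (10.0.0.2) still lists node 1 (10.0.0.1) as seed.
sample = """cluster_name: 'Test Cluster'
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.0.0.1"
"""
print(make_standalone(sample, "Db2", "10.0.0.2"))
```

The function only transforms text, so the result can be reviewed before it is written back to the real file while the node is stopped.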

Use the option cassandra.override_decommission=true. It is a JVM system property, so pass it at startup as -Dcassandra.override_decommission=true.

Use that option, cassandra.override_decommission=true. Also, be aware of the definition of cluster_name in cassandra.yaml:
The name of the cluster. This setting prevents nodes in one logical
cluster from joining another. All nodes in a cluster must have the
same value.
So, to be sure, also use a different value for the cluster_name option in cassandra.yaml.
Try these steps:
1. Run in cqlsh: UPDATE system.local SET cluster_name = 'new_name' WHERE key='local';
2. Run nodetool flush to persist the change.
3. Run nodetool decommission.
4. Stop the node.
5. Change the cluster name in cassandra.yaml.
6. Clean the node: sudo rm -rf /var/lib/cassandra/* /var/log/cassandra/* (I would rather move those files somewhere else until the node is in the state you want).
7. Start the node.
Please check 1, 2
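The suggestion to move the files aside instead of deleting them could look like the sketch below. It is a minimal illustration under assumptions: the helper name set_aside and the backup location are hypothetical, the node must already be stopped, and the script needs permission to read and move the Cassandra directories.

```python
import shutil
import time
from pathlib import Path


def set_aside(directories, backup_root):
    """Move the contents of each directory into a timestamped backup folder.

    Returns the backup folder path. The directories themselves are kept
    (empty), so their ownership and permissions stay as they were.
    """
    backup = Path(backup_root) / time.strftime("cassandra-backup-%Y%m%d-%H%M%S")
    for directory in directories:
        source = Path(directory).resolve()
        # Mirror the full path under the backup folder so that
        # /var/lib/cassandra and /var/log/cassandra do not collide.
        target = backup / source.relative_to(source.anchor)
        target.mkdir(parents=True, exist_ok=True)
        # Snapshot the listing first because entries are moved while looping.
        for entry in list(source.iterdir()):
            shutil.move(str(entry), str(target / entry.name))
    return backup


# Example call with the paths from the answer (the backup root is hypothetical):
# set_aside(["/var/lib/cassandra", "/var/log/cassandra"], "/var/backups")
```

Once the new standalone node behaves as expected, the backup folder can be deleted.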

Related

Cassandra seeds gets added back to cluster after removenode and restart

Say I have a 2-node Cassandra cluster (for simplicity) and I decided to remove one of the nodes from the cluster. I updated the seeds property in the cassandra.yaml files of both nodes.
I logged into node1 and executed the command below, where the Host ID corresponds to node2:
nodetool removenode <Host ID>
The command succeeds, and I could verify the result with the following command on node1:
nodetool status
When I restart Cassandra on node1 and run nodetool status again, I notice that node2 has been added back to the cluster.
What is the correct way to remove a Cassandra node from a cluster?
Cassandra Version : 2.1.8
Just add the following step before starting the node:
Remove the saved_caches folder from your data directory, then start the node.
Note:
Removing the saved_caches directory will not lead to any data loss. If you remove the complete data directory (or the commit log) you will lose data.
Cassandra saves its caches to disk in this directory so that a restarted node does not begin with cold caches.
You can find details at the link below:
https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_configuring_caches_c.html

Adding new node to Cassandra cluster

I have a 4 node cluster and will be adding an additional node in two days. We aren't using vnodes.
Just wondering the best way to rebalance the cluster after I'm done. Do I just bring the new node up and then start the nodetool move?
Or do I shut each node down, change the initial_token value for each one (using one of those generators to calculate the values for me) and then bring each node up?
I just want to know the simplest way to do this from command line. The new node already has Cassandra installed as it was initially a non-production server, I will delete the data off of the node and change the config files accordingly for the new cluster it will now be a part of, just unsure as to the other steps.
According to the page "Adding or replacing single-token nodes", the simplest mechanism is to start the new node with its initial_token left empty in cassandra.yaml. This will make the cluster 'split the token range of the heaviest loaded node and position the new node there'. This won't give you a balanced cluster.
If you want a balanced cluster then you have to go through the nodetool move, node restart, nodetool cleanup procedure you mentioned.
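For the balanced case, the initial_token values that the "generators" produce can be computed directly. The sketch below assumes Murmur3Partitioner by default (token range -2**63 to 2**63 - 1) and offers RandomPartitioner (0 to 2**127 - 1) as an alternative; the question does not say which partitioner is in use, and the function name balanced_tokens is hypothetical.

```python
def balanced_tokens(node_count, partitioner="murmur3"):
    """Evenly spaced initial_token values for a single-token ring."""
    if partitioner == "murmur3":
        # Murmur3Partitioner tokens span -2**63 .. 2**63 - 1.
        step = 2**64 // node_count
        return [i * step - 2**63 for i in range(node_count)]
    if partitioner == "random":
        # RandomPartitioner tokens span 0 .. 2**127 - 1.
        step = 2**127 // node_count
        return [i * step for i in range(node_count)]
    raise ValueError(f"unknown partitioner: {partitioner}")


# Tokens for the 5-node ring from the question (4 existing nodes + 1 new).
for index, token in enumerate(balanced_tokens(5)):
    print(f"node {index}: initial_token: {token}")
```

Comparing balanced_tokens(4) with balanced_tokens(5) shows that every token except the first one changes, which is why the existing nodes need nodetool move followed by nodetool cleanup.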

How can the seemingly odd behavior in Cassandra cluster be explained?

I created an Apache Cassandra 2.1.2 cluster of 50 nodes and kept the default cluster name, "Test Cluster". Then, for some testing, I separated one node from the 50-node cluster: I shut down Cassandra, deleted the data directories, and ran nodetool flush. I then reconfigured that node as its own cluster named "Single Node Test Cluster", editing the seeds, cluster_name and listen_address fields accordingly. I also set up JMX correctly. Here is what happens:
1. When I run nodetool status on the single node, I see only one node as up and running. If I run nodetool describecluster, I see the new cluster name - "Single Node Test Cluster"
2. When I run nodetool commands on one of the 49 nodes, I see total 50 nodes with the single node as down and I see cluster name as "Test Cluster"
3. There are datastax-agents installed on each node and I had also setup OpsCenter to monitor the cluster. In OpsCenter, I still see 50 nodes as up and cluster name as "Test Cluster"
So my question is: why am I seeing these 3 different depictions of the same topology, and is this expected?
Another issue is that when I start Cassandra on the single node, it still tries to communicate with the other nodes, and I keep getting a cluster name mismatch WARN (Test Cluster != Single Node Test Cluster) on the console before the single-node cluster starts.
Is this expected, or is this a bug in Cassandra?
Yes. If you remove a node from your cluster, you need to inform the rest of the cluster that it is gone.
You do that by decommissioning the node while it is still in the cluster, or by running nodetool removenode from another node when the node is already gone (i.e. you no longer have access to the box).
If you do neither of the above, you'll still see the node in the other's system.peers table.

Cannot change the number of tokens from 1 to 256

I am using Cassandra 2.0 and cluster has been setup with 3 nodes. Nodetool status and ring showing all the three nodes. I have specified tokens for all the nodes.
I followed the below steps to change the configuration in one node:
1) sudo service cassandra stop
2) updated cassandra.yaml (to update thrift_framed_transport_size_in_mb)
3) sudo service cassandra start
The node did not start successfully, and system.log shows the exception below:
org.apache.cassandra.exceptions.ConfigurationException: Cannot change
the number of tokens from 1 to 256
What is the best mechanism to restart the node without losing the existing data in the node or cluster?
Switching from Non-Vnodes to Vnodes has been a slightly tricky proposition for C* and the mechanism for previously performing this switch (shuffle) is slightly notorious for instability.
The easiest way forward is to start fresh nodes (in a new datacenter) with vnodes enabled and to transfer data to those nodes via repair.
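The error text suggests that the token count configured in cassandra.yaml no longer matches what the node stored when it first joined. As a hedged aid for checking this before a restart, the sketch below reports which token settings are active (uncommented) in a cassandra.yaml text. The function name token_settings is hypothetical, and the reading that restoring the original single-token settings lets the node start with its data intact is an assumption based on the error message, not something confirmed in this thread.

```python
def token_settings(yaml_text):
    """Return the active (uncommented) num_tokens and initial_token values.

    A value of None means the setting is commented out, missing or empty.
    """
    found = {"num_tokens": None, "initial_token": None}
    for line in yaml_text.splitlines():
        key, sep, value = line.partition(":")
        # Commented or indented lines produce a key that is not in `found`.
        if sep and key in found:
            value = value.split("#", 1)[0].strip()  # drop trailing comments
            found[key] = value or None
    return found


# Hypothetical excerpt of an edited cassandra.yaml.
sample = """cluster_name: 'Test Cluster'
num_tokens: 256
# initial_token:
"""
print(token_settings(sample))
```

If this reports num_tokens as 256 on a node that has always owned a single token, the configuration matches the situation described by the exception.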
I was also getting this error while I was trying to change the number of tokens from 1 to 256. To solve this I tried the following:
Scenario:
I have a 4-node DSE (4.6.1) Cassandra cluster. Let's say their FQDNs are d0.cass.org, d1.cass.org, d2.cass.org and d3.cass.org. Here, the nodes d0.cass.org and d1.cass.org are the seed providers. My aim is to enable vnodes by changing the num_tokens attribute in the cassandra.yaml file.
Procedure to be followed for each node (one at a time):
1. Run nodetool decommission on the node.
2. Kill the Cassandra process on the decommissioned node: find the process ID for DSE Cassandra using ps ax | grep dse, then run kill <pid>.
3. Once the node has been decommissioned successfully, go to one of the remaining nodes and check the status of the cluster using nodetool status. The decommissioned node should not appear in the list.
4. Go to one of the active seed providers and run nodetool rebuild.
5. On the decommissioned node, open the cassandra.yaml file and uncomment the line num_tokens: 256. Save and close the file. If this node was originally a seed provider, make sure that its IP address is removed from the seeds list in cassandra.yaml. If this is not done, the stale cluster topology information it holds will interfere with the new topology provided by the new seed node. After a successful start, it can be added to the seed list again.
6. Restart the remaining cluster, either using the corresponding option in OpsCenter or by manually stopping and starting Cassandra on each node.
7. Finally, start Cassandra on the decommissioned node using the dse cassandra command.
This should work.

Cassandra cluster old data is not replicated in new node

I installed Apache Cassandra on my local system for testing purposes. With 1 system (1 node) I was able to read, write and query the database. I then added another node and created a cluster. Now the data that I write on my system is replicated to the other node and vice versa, but the data that was present on my system before the new node was added is not replicated. The keyspaces and tables are present on the new node, but they are empty. Did I do something wrong while adding the new node to the cluster?
My best guess is that you have auto_bootstrap turned off (it is ON by default). From the documentation:
auto_bootstrap
(Default: true ) This setting has been removed from default configuration. It makes new (non-seed) nodes automatically migrate the right data to themselves. When initializing a fresh cluster with no data, add auto_bootstrap: false.
The easiest way to fix this is to run a repair on the node which will ensure that it gets any data it's missing.
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
Simply run this on the new node if the data on it is not up to date:
$CASSANDRA_HOME/bin/nodetool rebuild
Then log in with cqlsh and verify that the new node has received the data.
Also see:
http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_add_dc_to_cluster_t.html
If your seeds and replication settings are correct but the data is still not being replicated to the new datacenter, run nodetool repair with the -full option. This will replicate the data to the new datacenter node.
