Cassandra: add nodes and modify snitch in existing cluster

Currently we have a 3-node Cassandra cluster with SimpleSnitch, and keyspaces using SimpleStrategy with RF=1.
Now we want to add 3 nodes to the cluster; they are located in the same DC but on different physical racks. The nodes are provisioned and Cassandra is installed on them, but I have never started them.
So my plan is:
1. Change the configs in /etc/cassandra/conf/cassandra.yaml, adding/changing the following:
auto_bootstrap: true
endpoint_snitch: GossipingPropertyFileSnitch
2. In /etc/cassandra/cassandra-rackdc.properties, set dc=datacenter1 and rack=rack1{2,3}.
3. Restart the old cluster (one node at a time) using nodetool drain && systemctl restart cassandra.
4. Start the 3 new nodes with the right settings (GossipingPropertyFileSnitch and the right rack).
5. Check nodetool describecluster and nodetool status.
6. ALTER the user and system keyspaces (system_auth, system_traces, and system_distributed) to change the strategy and RF.
7. Run nodetool repair -pr on every node in the cluster.
8. Run nodetool cleanup on every node (except the last node).
Is my plan correct? I've tried this on my dev Cassandra cluster and everything works.
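For the keyspace alteration step, the statements might look like the following sketch. The keyspace names come from the plan above, but the target RF of 3 and the datacenter name datacenter1 are assumptions that must match your cassandra-rackdc.properties:

```cql
-- Switch each keyspace from SimpleStrategy to NetworkTopologyStrategy.
-- 'datacenter1' must match dc= in cassandra-rackdc.properties;
-- the RF of 3 here is an assumption, pick what fits your cluster.
ALTER KEYSPACE system_auth
  WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 3};
ALTER KEYSPACE system_traces
  WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 3};
ALTER KEYSPACE system_distributed
  WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 3};
```

Note that the ALTER only changes metadata; the nodetool repair -pr pass in the following step is what actually populates the new replicas.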

Related

Do we need to run nodetool repair on every node in the cluster?

Does the nodetool repair command need to be run on all cluster nodes?
I understand that this command repairs the replicas on a node against the other replicas, and that we need to run it on all nodes to get high consistency.
Running "nodetool repair" on a single node only triggers a repair of its token ranges against the other nodes in the cluster. You need to run it on every node, sequentially, for all the data in the cluster to be repaired.
A good alternative/recommendation is to use "nodetool repair -pr". The "-pr" option means that only the primary range of tokens on the given node is repaired. But again, this needs to be run on every node in every DC of the cluster.
The repair command only repairs token ranges on the node being repaired; it doesn't repair the whole cluster. By default, repair operates on all token ranges replicated by the node you're running it on, which causes duplicate work if you run it on every node. The -pr flag repairs only the "primary" ranges on a node, so you can repair your entire cluster by running nodetool repair -pr on each node in a single datacenter. Reference
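A rolling primary-range repair over the cluster can be sketched as below; the hostnames are placeholders, and it assumes SSH access to each node with nodetool on the PATH:

```shell
# Repair each node's primary token ranges, one node at a time.
# Running `repair -pr` on every node covers the full ring exactly once.
for host in node1 node2 node3; do
  echo "repairing primary ranges on ${host}"
  ssh "${host}" nodetool repair -pr
done
```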

Cassandra seed node gets added back to the cluster after removenode and restart

I have, say, a 2-node Cassandra cluster (for simplification) and I decided to remove one of the nodes from the cluster. I updated the seeds property in the cassandra.yaml files of both nodes.
I logged into node1 and executed the command below, where the Host ID corresponds to node2:
nodetool removenode <host-id>
The command succeeded, and I could verify it with the command below on node1:
nodetool status
I restarted Cassandra on node1, executed nodetool status again, and noticed that node2 had been added back to the cluster.
What is the correct way to remove a Cassandra node from a cluster?
Cassandra version: 2.1.8
Just add the step below before starting the node:
Remove the saved_caches folder from your data directory, then start the node.
Note:
Removing the saved_caches directory will not lead to any data loss. If you remove the complete data directory (or the commit log) you will lose data.
Cassandra maintains some caches and persists them to disk to avoid a cold start.
You can get details from the link below:
https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_configuring_caches_c.html
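Putting question and answer together, the removal sequence might look like this sketch. The Host ID is a placeholder taken from nodetool status, and /var/lib/cassandra is an assumed data directory (check saved_caches_directory in your cassandra.yaml):

```shell
# On node1: remove node2 from the ring, using node2's Host ID
# as shown by `nodetool status`.
nodetool removenode <host-id-of-node2>
nodetool status                         # node2 should be gone

# Before restarting node1, clear its saved caches so the removed
# node is not resurrected from saved state on startup.
sudo service cassandra stop
rm -rf /var/lib/cassandra/saved_caches  # assumed saved_caches_directory
sudo service cassandra start
nodetool status                         # node2 should still be gone
```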

How can the seemingly odd behavior in Cassandra cluster be explained?

I created an Apache Cassandra 2.1.2 cluster of 50 nodes and named the cluster "Test Cluster", the default. Then, for some testing, I separated one node out of the 50-node cluster: I shut down Cassandra, deleted the data directories, and ran nodetool flush. Then I reconfigured the single node as a cluster called "Single Node Test Cluster", editing the seeds, cluster_name, and listen_address fields appropriately. I also set up JMX correctly. Now here is what happens.
1. When I run nodetool status on the single node, I see only one node, up and running. If I run nodetool describecluster, I see the new cluster name, "Single Node Test Cluster".
2. When I run nodetool commands on one of the 49 nodes, I see 50 nodes in total, with the single node shown as down, and the cluster name as "Test Cluster".
3. There are DataStax agents installed on each node, and I had also set up OpsCenter to monitor the cluster. In OpsCenter, I still see 50 nodes as up and the cluster name as "Test Cluster".
So my question is: why am I seeing these 3 different depictions of the same topology, and is this expected?
Another issue: when I start Cassandra on the single node, it still somehow tries to communicate with the other nodes, and I keep getting cluster name mismatch (Test Cluster != Single Node Test Cluster) warnings on the console before the single-node cluster starts.
Is this expected, or is this a bug in Cassandra?
Yes, if you remove a node from your cluster you need to inform the rest of the cluster that it is gone.
You do that by decommissioning the node while it is still in the cluster, or by running nodetool removenode from another node when the node is already gone, i.e. you no longer have access to the box.
If you do neither of the above, you'll still see the node in the other nodes' system.peers table.
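To see what a surviving node still believes about the topology, you can query system.peers directly from cqlsh (a sketch; the available columns vary a little between Cassandra versions):

```cql
-- On one of the 49 nodes: list the peers this node still knows about.
-- The separated node keeps appearing here until it is decommissioned
-- or removed via nodetool removenode.
SELECT peer, data_center, rack, release_version FROM system.peers;
```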

Cannot change the number of tokens from 1 to 256

I am using Cassandra 2.0 and the cluster has been set up with 3 nodes. nodetool status and nodetool ring show all three nodes. I have specified tokens for all the nodes.
I followed the steps below to change the configuration on one node:
1) sudo service cassandra stop
2) updated cassandra.yaml (to change thrift_framed_transport_size_in_mb)
3) sudo service cassandra start
The node did not start successfully, and system.log shows the exception below:
org.apache.cassandra.exceptions.ConfigurationException: Cannot change the number of tokens from 1 to 256
What is the best mechanism to restart the node without losing the existing data on the node or in the cluster?
Switching from non-vnodes to vnodes has been a slightly tricky proposition for C*, and the mechanism previously provided for performing this switch (shuffle) is somewhat notorious for instability.
The easiest way forward is to start fresh nodes (in a new datacenter) with vnodes enabled and to transfer data to those nodes via repair.
I was also getting this error while trying to change the number of tokens from 1 to 256. To solve it, I did the following:
Scenario:
I have a 4-node DSE (4.6.1) Cassandra cluster. Let's say their FQDNs are d0.cass.org, d1.cass.org, d2.cass.org, and d3.cass.org. Here, d0.cass.org and d1.cass.org are the seed providers. My aim is to enable vnodes by changing the num_tokens attribute in the cassandra.yaml file.
Procedure to be followed for each node (one at a time):
1. Run nodetool decommission on the node: nodetool decommission
2. Kill the Cassandra process on the decommissioned node. Find the process id for DSE Cassandra using ps ax | grep dse and kill <pid>.
3. Once the decommissioning of the node is successful, go to one of the remaining nodes and check the status of the Cassandra cluster using nodetool status. The decommissioned node should not appear in the list.
4. Go to one of the active seed providers and type nodetool rebuild.
5. On the decommissioned node, open the cassandra.yaml file and uncomment num_tokens: 256. Save and close the file. If this node was originally a seed provider, make sure its IP address is removed from the seeds: list in cassandra.yaml; if this is not done, the stale information it holds about the cluster topology will conflict with the new topology provided by the new seed node. After a successful start, it can be added back to the seed list.
6. Restart the remaining cluster, either using the corresponding option in OpsCenter or by manually stopping Cassandra on each node and starting it again.
7. Finally, start Cassandra on the decommissioned node using the dse cassandra command.
This should work.
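The per-node procedure above can be sketched as a command sequence (hypothetical; hostnames and the pid are placeholders, and a DSE install is assumed):

```shell
# On the node being converted (e.g. d3.cass.org):
nodetool decommission
kill <pid>        # pid of the DSE Cassandra process, from `ps ax | grep dse`

# On a remaining node: confirm the node has left the ring.
nodetool status

# Back on the converted node: set `num_tokens: 256` in cassandra.yaml,
# remove the node's own IP from the seeds: list if it was a seed, then:
dse cassandra
```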

Adding a new node to an existing one-node Cassandra cluster

Starting from one host running Cassandra, I am trying to add a new node and form a cluster.
I updated the seeds list on both hosts, and after restarting both nodes I ran nodetool status and see both nodes forming a cluster. However, I am seeing a data loss issue: I am not seeing all the data that I added to a column family before I added the new node.
Steps to reproduce:
1. Start a node with the following settings in cassandra.yaml:
initial_token:
num_tokens: 256
seed_list: host1
2. Create a keyspace and a column family and enter some data.
3. Start another node with the exact same settings as host1, with the following change on both: seeds: host1, host2
4. When I log in with cqlsh from host2, I do not see all the data.
Running:
nodetool cleanup
nodetool repair
nodetool rebuild
should solve the issue.
I would suggest running nodetool cleanup on both nodes so that the keys get redistributed.
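The suggested commands as a sequence, with what each one does (a sketch; run on each of the two nodes, and note that nodetool rebuild is normally used when bootstrapping a new datacenter):

```shell
# Run on each of the two nodes after they have formed a cluster:
nodetool cleanup   # drop keys for token ranges this node no longer owns
nodetool repair    # synchronize replicas between the two nodes
nodetool rebuild   # stream data for newly owned ranges (typically used
                   # when adding a whole datacenter)
```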
