Adding a new node to existing node cassandra cluster

Adding a new node to existing node cassandra cluster - cassandra

Starting from one host running Cassandra, I am trying to add a new node and form a cluster.
I update the seeds list on both hosts and after restarting both nodes, I do nodetool status and see both nodes forming a cluster. However, I am seeing some data loss issue. I am not seeing all the data that I added to a column family before I added the new node.
Steps to reproduce:
Start a node with following settings in cassandra.yaml
initial_token:
num_tokens:256
seed_list: host1
Create a keyspace and a column family and enter some data
Start another node, exact same settings and host1 with the following settings changes on both - seeds: host1, host2
When I log in to cal from host2, I do not see all data.

Running:
nodetool cleanup
nodetool repair
nodetool rebuild
should solve the issue.

Will suggest you to run a nodetool cleanup in both the nodes so that keys get distributed.

Related

How to reuse a cassandra node after decommission?

I had a cluster with 2 nodes (node 1 and node 2).
After decommissioning node 2 I wanted to use the server as a fresh Cassandra database for other purposes, but as soon as I restart this message appears:
org.apache.cassandra.exceptions.ConfigurationException: This node was
decommissioned and will not rejoin the ring unless
cassandra.override_decommission=true has been set, or all existing
data is removed and the node is bootstrapped again
So I removed all existing data.
But I don't want the node to be bootstrapped again (neither rejoin the previous ring) but to be a fresh new and pure Cassandra database to be used.
The old node is not on the seed list.
Cassandra version: 3.9
EDIT: I think I was missunderstood, sorry for that. After the decommission I want to have:
Db1: node 1
Db2: node 2
Two diferent databases with no correlation, totally separated. That's because we want to reuse the machine where node2 is hosted again to deploy a Cassandra DB in another enviroment.

Don't use override_decommission. That flag is only used for rejoining the same cluster.
You should remove all data files on the node (Cassandra will recreate system tables on start). Most importantly you need to change the seed in cassandra.yaml. I suspect that it is still the ip of node 1, so you need to change it to node 2 (itself).

Use option
cassandra.override_decommission: true

Use that option, cassandra.override_decommission=true. Also, be aware what is the definition of cluster_name is cassandra.yaml:
The name of the cluster. This setting prevents nodes in one logical
cluster from joining another. All nodes in a cluster must have the
same value.
So, to be sure, also use another value for cluster_name option in cassandra.yaml.
Try these steps:
run in cqlsh: UPDATE system.local SET cluster_name = 'new_name'
where key='local';
nodetool flush in order to persist the data
nodetool decommission
stop node
change name in cassandra.yaml
clean node sudo rm -rf /var/lib/cassandra/* /var/log/cassandra/* but I would just move those file in some other place until you get the state that you want
start node
Please check 1, 2

Cassandra seeds gets added back to cluster after removenode and restart

I have say 2 node Cassandra cluster (for simplification) and i decided to remove one of the node from cluster. Updated the seeds property in cassandra.yaml files of both the nodes.
I logged into node1 and executed below command where Host Id corresponds node2
nodetool removenode
The above command succeeds and i could verify it from below command on Node1
nodetool status
I restart cassandra on Node1 and execute nodetool status again and notice Node2 is added back to cluster.
What is the correct way to remove a cassandra node from cluster ?
Cassandra Version : 2.1.8

Just add below step before starting the node.
Remove the saved_caches folder from your data directory and start the node.
Note:
Removing saved_caches directory will not lead to any data loss. If you remove the complete data directory (or commitLog) you will lose data.
Cassandra manages some caching mechanism and stores those cache information to avoid cold start.
You can get details from below link:
https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_configuring_caches_c.html

Changing Cluster Name in Cassandra

I have a cluster with 2 machines (centos7 and cassandra 3.4), 192.168.0.175 and 192.168.0.174. The seed is the 192.168.0.175.
I simply want to change the cluster name. Peace of cake should be.
I did on each cluster :
update system.local set cluster_name = 'America2' where key='local';
i did the nodetool flush
i updated the cassandra.yaml with the new name
restarted cassandra.
When i cqlsh any if describes me as connected to new cluster_name America2
When i run nodetool describecluster it shows the old cluster name America
If i stop cassandra on both machines and i try to restart them i find in logs the good old error :
org.apache.cassandra.exceptions.ConfigurationException: Saved cluster name America != configured name America2
So....what am i doing wrong!?

before changing cluster name
delete node from cluster ring
nodetool decommission
stop node and change cluster name in cassandra.yaml
clean node
sudo rm -rf /var/lib/cassandra/* /var/log/cassandra/*
start cassandra node
More information you can find at academy.datastax.com

Ok guys, what i did :
cqlsh with each machine and :
update system.local set cluster_name = 'canada' where key='local' ;
then
$ nodetool flush -- system
then i stoped the service on both machines.
modified the cassandra.yaml with new cluster name canada.
started back the machines, they were working with new cluster name.
It is possible to do those steps without stoping all machines in the cluster, taking them out one by one( i think a repair on each node might be neccessary after ). Consider changing the seeds first.

It's not really possible. I had the same problem. I solved this in a really dirty way. I wrote a script where i got all my column family data. Simply: A backup. Then i done this on each node: i stopped cassandra, i dropped all cassandras data, cache etc. (You can also reinstall cassandra.) i created a new cluster imported my backup. And i done this for each node.

Cannot change the number of tokens from 1 to 256

I am using Cassandra 2.0 and cluster has been setup with 3 nodes. Nodetool status and ring showing all the three nodes. I have specified tokens for all the nodes.
I followed the below steps to change the configuration in one node:
1) sudo service cassandra stop
2) updated cassandra.yaml (to update thrift_framed_transport_size_in_mb)
3) sudo srevice cassandra start
The specific not started successfully and system.log shows below exception:
org.apache.cassandra.exceptions.ConfigurationException: Cannot change
the number of tokens from 1 to 256
What is best mechanism to restart the node without losing the existing data in the node or cluster ?

Switching from Non-Vnodes to Vnodes has been a slightly tricky proposition for C* and the mechanism for previously performing this switch (shuffle) is slightly notorious for instability.
The easiest way forward is to start fresh nodes (in a new datacenter) with vnodes enabled and to transfer data to those nodes via repair.

I was also getting this error while I was trying to change the number of tokens from 1 to 256. To solve this I tried the following:
Scenario:
I have 4 node DSE (4.6.1) cassandra cluster. Let say their FQDNs are: d0.cass.org, d1.cass.org, d2.cass.org, d3.cass.org. Here, the nodes d0.cass.org and d1.cass.org are the seed providers. My aim is to enable nodes by changing the num_token attribute in the cassandra.yaml file.
Procedure to be followed for each node (one at a time):
Run nodetool decommission on one node: nodetool decommission
Kil the cassandra process on the decommissioned node. Find the process id for dse cassandra using ps ax | grep dse and kill <pid>
Once the decommissioning of the node is successful, go to one of the remaining nodes and check the status of the cassandra cluster using nodetool status. The decommissioned node should not appear in the list.
Go to one of the active seed_providers and type nodetool rebuild
On the decommissioned node, open the cassandra.yaml file and uncomment the num_tokens: 256. Save and close the file. If this node was originally seed provider, make sure that it's ip-address is removed from the seeds: lists from cassandra.yaml file. If this is not done, the stale information about the cluster topology it has will hinder with the new topology which is being provided by the new seed node. On successful start, it can be added again in the seed list.
Restart the remaining cluster either using the corresponding option in opscenter or manually stopping cassandra on each node and starting it again.
Finally, start cassandra on it using dse cassandra command.
This should work.

Showing old cluster ips when i run nodetool status for new cluster

I already have a cassandra cluster of 4 nodes (version 1.2.5). I made an image of one of the cassandra instance in the existing cluster and created new cluster with 2 nodes. When i started new instances and checked nodetool status it is showing the old cluster ips also. Checked the cassandra.yaml file. Seeds are set for new nodes only. What else needs to be changed?

I think the cluster metainfo is stored in the system keyspace--check the peers table.

#Alex Popescu , Thanks that worked for me.
1)SELECT * FROM system.peers; - gave me the entries for the hosts .
2)REMOVED the entry for unwanted node.
3)Ran nodetool flush system.
4)Restarted cassandra on all nodes of the cluster.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string