I have a 4 node cluster and will be adding an additional node in two days. We aren't using vnodes.
Just wondering the best way to rebalance the cluster after I'm done. Do I just bring the new node up and then start the nodetool move?
Or do I shut each node down, change the initial_token value for each one (using one of those generators to calculate the values for me) and then bring each node up?
I just want to know the simplest way to do this from the command line. The new node already has Cassandra installed, as it was initially a non-production server. I will delete the data off of the node and change the config files accordingly for the new cluster it will now be a part of; I'm just unsure about the other steps.
From the page Adding or replacing single-token nodes, the simplest mechanism is to start the new node with its initial_token left empty in cassandra.yaml. This will make the cluster 'split the token range of the heaviest loaded node and position the new node there'. This won't give you a balanced cluster.
If you want a balanced cluster then you have to go through the nodetool move, node restart, nodetool cleanup procedure you mentioned.
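As a rough sketch of that procedure (assuming the Murmur3Partitioner and the standard i * (2**64 / N) - 2**63 spacing for N nodes; the token shown in the move command is just the first generated value):

# generate five evenly spaced Murmur3 tokens
python -c 'n=5; print("\n".join(str(i * (2**64 // n) - 2**63) for i in range(n)))'

# on each node in turn, move it to its new token (negative tokens need escaping
# so they are not parsed as flags)
nodetool move \\-9223372036854775808

# once every node has been moved, reclaim the keys each node no longer owns
nodetool cleanup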
Related
I was trying to move one of my nodes to a machine with higher specs (more memory and CPU). So I ran nodetool decommission, waited for the node to leave the ring, terminated the machine, and added a new machine. After that I configured cassandra.yaml with:
cluster_name
listen_address
broadcast_rpc_address
- seeds: with the seed machines' IPs
After starting the Cassandra service, the seed node joined the ring right away and with a low load, which is really strange to me, since the other nodes took a lot of time to join the ring.
After an hour, the seed node still has the same load.
What should I do to add the seed node?
Thanks in advance.
Yes, many newer users of Cassandra have this idea that a seed node is some mystical master-node equivalent. It's really not anything special. Essentially, a node needs to know about the cluster topology at start-up time, and the seeds property provides a list of nodes it can contact to learn that topology.
In theory, you can have a new node designate any existing node as its seed node. And that node could designate another node as its seed node. And so on, and so on. All it does is use that node to figure out the cluster topology.
After starting the Cassandra service, the seed node joined the ring right away and with a low load, which is really strange to me, since the other nodes took a lot of time to join the ring.
After an hour, the seed node still has the same load.
What should I do to add the seed node?
Seed nodes do not stream data. The extra steps required to get data onto a seed node are one of the main reasons it's not a good idea to designate all nodes as seed nodes.
You could just run a nodetool repair or rebuild on the new seed node, and that would stream data to it. The problem with this approach is that the node will still be accepting requests (and probably failing them) while it is streaming data.
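For example (a hedged sketch; the data center name is a placeholder), once the seed node is up you could stream data to it with either:

nodetool rebuild -- DC1    # stream from the existing data center, here called DC1
nodetool repair -pr        # or repair just the token ranges this node owns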
The other approach would be to add the new node while specifying other existing node(s) in that node's seeds list.
Once it is up and has streamed data, then you have a couple of options:
1. Leave everything as-is; any future nodes can use your new node in their seed lists.
2. If your other existing nodes have a node (or nodes) in their seed lists that no longer exists, you can update those to point to your new node as a seed.
The nice part about option #2 is that you can change it in cassandra.yaml without having to restart those nodes. That's because the only time you'll ever need that change is when a restart happens anyway; the seed node designation doesn't come into play during normal operations.
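For reference, the seed list that option #2 refers to lives under seed_provider in cassandra.yaml; a hypothetical example pointing at two existing nodes (IPs made up) looks like:

seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.0.0.1,10.0.0.2"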
Hope that helps!
I had a cluster with 2 nodes (node 1 and node 2).
After decommissioning node 2, I wanted to use the server as a fresh Cassandra database for other purposes, but as soon as I restart it, this message appears:
org.apache.cassandra.exceptions.ConfigurationException: This node was
decommissioned and will not rejoin the ring unless
cassandra.override_decommission=true has been set, or all existing
data is removed and the node is bootstrapped again
So I removed all existing data.
But I don't want the node to be bootstrapped again (neither rejoin the previous ring) but to be a fresh new and pure Cassandra database to be used.
The old node is not on the seed list.
Cassandra version: 3.9
EDIT: I think I was misunderstood, sorry for that. After the decommission I want to have:
Db1: node 1
Db2: node 2
Two different databases with no correlation, totally separated. That's because we want to reuse the machine where node 2 is hosted to deploy a Cassandra DB in another environment.
Don't use override_decommission. That flag is only used for rejoining the same cluster.
You should remove all data files on the node (Cassandra will recreate the system tables on start). Most importantly, you need to change the seed in cassandra.yaml. I suspect it is still the IP of node 1, so you need to change it to node 2 (itself).
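A minimal sketch of that reset, assuming a packaged install with the default directories (paths and service commands will vary with your setup):

sudo service cassandra stop
# wipe the old data; Cassandra recreates the system tables on start
sudo rm -rf /var/lib/cassandra/data /var/lib/cassandra/commitlog /var/lib/cassandra/saved_caches
# in cassandra.yaml, point "- seeds:" at node 2's own IP
sudo service cassandra start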
Use the option cassandra.override_decommission=true. Also, be aware of the definition of cluster_name in cassandra.yaml:
The name of the cluster. This setting prevents nodes in one logical
cluster from joining another. All nodes in a cluster must have the
same value.
So, to be sure, also use a different value for the cluster_name option in cassandra.yaml.
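If you do go down that route, note that cassandra.override_decommission is a startup (JVM) property rather than a cassandra.yaml setting; one way to set it, assuming a tarball install (packaged installs usually add it to JVM_OPTS in conf/cassandra-env.sh instead):

bin/cassandra -Dcassandra.override_decommission=true
# or in conf/cassandra-env.sh:
JVM_OPTS="$JVM_OPTS -Dcassandra.override_decommission=true"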
Try these steps:
run in cqlsh: UPDATE system.local SET cluster_name = 'new_name' WHERE key='local';
nodetool flush in order to persist the data
nodetool decommission
stop node
change name in cassandra.yaml
clean the node with sudo rm -rf /var/lib/cassandra/* /var/log/cassandra/*, though I would just move those files somewhere else until you reach the state that you want
start node
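Roughly, as commands (a sketch assuming a packaged install and the default directories; 'new_name' is a placeholder, and keeping a copy of the old data is safer than deleting it outright):

cqlsh -e "UPDATE system.local SET cluster_name = 'new_name' WHERE key='local';"
nodetool flush
nodetool decommission
sudo service cassandra stop
# edit cluster_name in cassandra.yaml to 'new_name'
sudo mv /var/lib/cassandra /var/lib/cassandra.bak
sudo service cassandra start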
I created an AMI from my Cassandra machine and then launched a new instance. After making config changes (setting the seed node to the first one, and setting auto_bootstrap: false), when I start Cassandra and do a nodetool status, it shows data on both of the nodes. I just want to know if the cluster actually knows that both nodes have the data, and whether an incoming request can be routed to the second node as well.
Without manually copying data, the streaming doesn't actually complete. It somehow manages to fail after a certain period of time, and then I have to run 'nodetool bootstrap resume' again to restart the bootstrapping process, which fails again.
I don't think this should work this way (all the copying thing).
Why can't you perform normal bootstrapping? What are the error messages in the logs when you try? What is the RF of your keyspace?
In addition to your data, Cassandra also saves information about the node on disk in the system tables, for example the node id, so you can't just replicate the image. If you copied the Cassandra image and just changed the config, this won't work; you should delete all data prior to starting the node and joining it to the cluster.
EDIT:
If you're going with auto_bootstrap: false
Remove all the data from the new server (both data and commit log directories).
Start the node, and after it joins, run rebuild.
Run repair after the process is finished.
If you're going with auto_bootstrap: true
Remove all the data from the new server (both data and commit log directories).
Start the node and monitor the bootstrapping.
Before trying these, remove the node you can't add from the cluster; a rough command sketch follows.
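A hedged sketch of the auto_bootstrap: false path (the host ID, DC name and paths are placeholders):

nodetool removenode <host-id-of-the-stuck-node>    # run from an existing, healthy node
sudo rm -rf /var/lib/cassandra/data /var/lib/cassandra/commitlog    # on the new node
# set auto_bootstrap: false in cassandra.yaml, start Cassandra, wait for UN in nodetool status
nodetool rebuild -- <existing-dc-name>
nodetool repair -pr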
I added a new node to a Cassandra cluster by making it a seed node and then started rebuilding it with the nodetool rebuild command. Although the node joined the cluster quickly, the rebuild process, which started streaming from all nodes in the selected DC, caused the whole DC's nodes to slow down. The impact on the application is severe; I'll have to stop the rebuild process in order to keep normal operation going.
Here, I'm seeking advice: can you share ways/tricks to minimize the impact of the node rebuild operation on the rest of the DC's nodes and on the application?
I'd much appreciate your suggestions. Thanks for reading my message, and thanks in advance for your help.
When adding a new node you shouldn't make it a seed node. Seed nodes are used to bootstrap other nodes and join them to the cluster. Making the new node a seed node means it will not bootstrap, so it will not stream the data it needs from the cluster. Follow the steps provided in the Cassandra docs at the link below.
https://docs.datastax.com/en/archived/cassandra/2.0/cassandra/operations/ops_add_node_to_cluster_t.html
This is the best way to add a new node in the cluster.
Note: Make sure the new node is not listed in the -seeds list. Do not make all nodes seed nodes. Please read Internode communications (gossip).
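For illustration (assuming the new node's seeds in cassandra.yaml point only at existing nodes, never at itself), you would just start it and watch it bootstrap:

sudo service cassandra start
nodetool status    # the new node should show as UJ while it streams data, then UN when it has joined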
As I understand it, you added the node as a seed node just so it would not bootstrap and would join the cluster instantly. While this approach is right in that the node joins the cluster quickly, the downside is that it will not bootstrap and hence will not copy the data from other nodes that it is responsible for. When you run a rebuild on that node, data is blindly copied (without any validation) from other nodes, which can choke the existing nodes' throughput and your network pipeline. This approach is safe and recommended when adding a new DC, but not when adding nodes to an existing DC.
When you are adding a node to an existing DC, the simplest way is to use the procedure described here: https://docs.datastax.com/en/archived/cassandra/2.0/cassandra/operations/ops_add_node_to_cluster_t.html
When the node bootstraps, it copies the data it needs from other nodes and will not start taking client connections until it has fully bootstrapped and validated the data. So, add one node at a time and let it bootstrap so all the necessary data is copied and validated. After you are done adding the number of nodes you desire, run a cleanup on all the previously joined nodes to clean up the keys for which the old nodes are no longer responsible.
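A sketch of that flow, with the commands run on the hosts indicated in the comments:

# on the new node, after it has been started with the default auto_bootstrap: true
nodetool netstats    # shows the bootstrap streams in progress
nodetool status      # the node moves from UJ (joining) to UN (normal) when done
# then, on each previously existing node in the DC:
nodetool cleanup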
Is it possible to add a new node to an existing cluster in cassandra 1.2 without running nodetool cleanup on each individual node once data has been added?
It probably isn't but I need to ask because I'm trying to create an application where each user's machine is a server allowing for endless scaling.
Any advice would be appreciated.
Yes, it is possible. But you should be aware of the side-effects of not doing so.
nodetool cleanup purges keys that are no longer allocated to that node. According to the Apache docs, these keys count against the allocated data for that node, which can cause the auto bootstrap process for the next node to not properly balance the ring. So depending on how you are bringing new user machines into the ring, this may or may not be a problem.
Also keep in mind that nodetool cleanup only needs to be run on the nodes that lost part of their token range to the new node - i.e. adjacent nodes, not all nodes in the cluster.
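When you do get around to it, the command itself is simple; run it on each node that lost ranges to the new node (it rewrites SSTables, so it is I/O-intensive and worth staggering across nodes):

nodetool cleanup
nodetool cleanup my_keyspace    # or limit it to a single keyspace (name here is a placeholder)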