When to enable repairs after adding new nodes to a Cassandra cluster?

I am adding new nodes to my existing Cassandra cluster, which runs Cassandra 2.1.16 with vnodes. I had cron jobs scheduled to run repairs on these nodes, and I disabled them before adding the new nodes. Should I re-enable the repairs only after both the token moves and the cleanups are complete, or can I re-enable them after the token moves but before the cleanups?

You can enable your repair jobs after you do the cleanup. I suggest reading this article, especially the Gotchas section for the Range movement. If the scenario described there applies to you, then you would need to run repair manually on the node, after bootstrapping.

Once the new node has joined your existing cluster and shows status UN (Up/Normal) in nodetool status, you can run repair, and after that you can run cleanup as well.
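The "status UN" precondition can be checked from a script before the repair cron jobs are switched back on. Below is a minimal sketch, assuming the usual two-letter state column at the start of each node row in nodetool status output. The function names and the sample text are made up for illustration and are not part of any Cassandra tooling.

```python
import re

# Node rows in `nodetool status` start with a two-letter state code:
# first letter U(p)/D(own), second letter N(ormal)/L(eaving)/J(oining)/M(oving).
_NODE_ROW = re.compile(r"^([UD][NLJM])\s+(\S+)")


def node_states(status_output):
    """Return {address: state} parsed from `nodetool status` text."""
    states = {}
    for line in status_output.splitlines():
        match = _NODE_ROW.match(line.strip())
        if match:
            state, address = match.groups()
            states[address] = state
    return states


def all_nodes_up_normal(status_output):
    """True only if at least one node is listed and every node is UN."""
    states = node_states(status_output)
    return bool(states) and all(s == "UN" for s in states.values())


# Hypothetical sample output; real output has the same row layout.
SAMPLE = """\
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load     Tokens  Owns  Host ID  Rack
UN  10.0.0.1   1.2 GB   256     ?     aaaa     rack1
UJ  10.0.0.2   300 MB   256     ?     bbbb     rack1
"""

print(node_states(SAMPLE))
print(all_nodes_up_normal(SAMPLE))
```

In practice the text would come from running nodetool status on one of the nodes. A node that is still joining (UJ) makes the check fail, so repairs stay disabled until bootstrapping has finished.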

Related

What is the Correct order to restart a cluster for point-in-time restore?

I have a mixed workload cluster across multiple datacenters. I have run the sstableloader command for the tables I want to restore, using snapshots which I had backed up. I have added the commit log files which I had backed up from the archive to a restore directory on all nodes. I have updated the commitlog_archiving.properties file with these configs.
What is the correct way and order to restart nodes of my cluster?
Do these considerations apply for restarting as well?
As a general rule, we recommend restarting seed nodes in the DC first before other nodes so gossip propagation happens faster particularly for larger clusters (arbitrarily 15+ nodes). It is important to note that a restart is not required if you restored data using sstableloader.
If you are just performing a rolling restart then the order of the DCs does not matter. But it matters if you are starting up a cluster from a cold shutdown, meaning all nodes are down and the cluster is completely offline.
When starting from a cold shutdown, it is important to start with the "Analytics DC" (nodes running in Analytics mode, i.e. with Spark enabled) because it makes it easier to elect a Spark master. Assuming that the replication for Analytics keyspaces is configured with the recommended replication factor of 3, you will need to start 2 or 3 nodes, beginning with the seeds and ideally 1 minute apart, because the LeaderManager requires a quorum of nodes to elect a Spark master.
We recommend leaving DCs with nodes running in Search mode (with Solr enabled) last as a matter of convenience so that all the other DCs are operational before the cluster starts accepting Search requests from the application(s). Cheers!
If you've done all of that, I don't think the order matters too much. That said, you should restart your seed nodes first, so that the other nodes in the cluster have a common entry point to find their way back in and correctly rejoin.
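The ordering advice in these answers (seeds first within a DC; for a cold start, the Analytics DC first and the Search DC last) can be written down as a small planning helper. This is only a sketch: the node names, DC names and workload labels are hypothetical, and it just computes an order rather than restarting anything.

```python
def restart_order(nodes, seeds):
    """Return nodes with seed nodes first, otherwise keeping the given order."""
    seed_set = set(seeds)
    first = [n for n in nodes if n in seed_set]
    rest = [n for n in nodes if n not in seed_set]
    return first + rest


def cold_start_dc_order(dc_workloads):
    """Order DCs for a cold start: Analytics first, Search last.

    dc_workloads maps a DC name to its workload label (hypothetical labels:
    "Analytics", "Search", anything else is treated as a plain Cassandra DC).
    """
    rank = {"analytics": 0, "search": 2}
    # sorted() is stable, so DCs with the same rank keep their given order.
    return sorted(dc_workloads, key=lambda dc: rank.get(dc_workloads[dc].lower(), 1))


print(restart_order(["n1", "n2", "n3", "n4"], seeds=["n3"]))
print(cold_start_dc_order({"dc_search": "Search",
                           "dc_oltp": "Cassandra",
                           "dc_spark": "Analytics"}))
```

For a rolling restart only restart_order is relevant, since the order of the DCs does not matter in that case.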

Cassandra Reaper - should I also repair Reaper's database?

So I have installed cassandra-reaper, and I have set up schedules to repair my project's db every Wednesday. I'm just wondering if there is any need to also schedule a repair for the database that cassandra-reaper created?
I don't think so, because Reaper is just a UI for scheduling and managing repairs on a Cassandra cluster.
It improves the existing nodetool repair process by:
Splitting repair jobs into smaller tunable segments.
Handling back-pressure through monitoring running repairs and pending compaction.
Adding ability to pause or cancel repairs and track progress precisely.
Reaper ships with a REST API, a command line tool and a web UI.

Cassandra DSC repair after a very long time

I have a small Cassandra DSC 2.2.8 cluster with 4 nodes that has been in service for a long time now (more than 6 months). I have not run repair again in all that time, and I am afraid that deleted data may have been resurrected. Is it now too late for a repair? If I run nodetool repair, the default is parallel mode; do I still need to run it on all 4 nodes one by one?
Running nodetool repair is a good way to optimize your node, and it also improves the node's performance. It will not resurrect the deleted data; in fact, it will perform compaction (which keeps the latest record in the database). You can perform repair on a whole DC as well as on an individual node.

Changing the snitch on a live cluster in DataStax 4.5

I have 8 nodes in one region and now I want to add a new node in another region. Presently I am using Ec2Snitch; after adding the node in the new region I need to change the snitch on all nodes to Ec2MultiRegionSnitch.
Now my question is: will this change impact my currently running cluster, and what would be the best practice for doing this?
Thanks
You should do a rolling restart that switches the nodes to Ec2MultiRegionSnitch before adding the new node. It should not impact your running cluster, though I would suggest you bring up a test cluster briefly to test making the change.
To perform a rolling restart from OpsCenter:
Click Nodes in the left pane.
In the contextual menu, select Restart from the Cluster Actions dropdown.
Set the amount of time to wait after restarting each node, select whether the node should be drained before stopping, and then click Restart Cluster.
See more details here:
http://www.datastax.com/documentation/opscenter/5.0/opsc/online_help/opscRestartingCluster_t.html
Here is a link to the DataStax documentation for switching snitches. I found that to be useful when I switched to the GossipingPropertyFileSnitch. I also had to edit cassandra-rackdc.properties on all nodes before doing the rolling restart.
Even though my topology didn't change, I followed the instructions in the reference. I stopped all the nodes, restarted them (starting with the seeds), then ran 'nodetool repair' and 'nodetool cleanup' on all nodes.
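For the cassandra-rackdc.properties edit mentioned above: with GossipingPropertyFileSnitch each node reads its own datacenter and rack from that file through the dc and rack keys. A minimal sketch of generating the file contents per node follows. The inventory, addresses and DC/rack names are hypothetical, and the values must match the topology the cluster already knows if the topology is not supposed to change.

```python
def render_rackdc(dc, rack):
    """Return the text of a cassandra-rackdc.properties file for one node."""
    return f"dc={dc}\nrack={rack}\n"


# Hypothetical inventory: node address -> (datacenter, rack).
INVENTORY = {
    "10.0.0.1": ("us-east", "1a"),
    "10.0.0.2": ("us-east", "1b"),
}

for address, (dc, rack) in INVENTORY.items():
    print(f"# {address}")
    print(render_rackdc(dc, rack))
```

The rendered text would then be written to the file on each node before the restart; how it gets there (manually or through a config management tool) is left out here.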

Adding a new node to existing cluster

Is it possible to add a new node to an existing cluster in Cassandra 1.2 without running nodetool cleanup on each individual node once data has been added?
It probably isn't but I need to ask because I'm trying to create an application where each user's machine is a server allowing for endless scaling.
Any advice would be appreciated.
Yes, it is possible. But you should be aware of the side-effects of not doing so.
nodetool cleanup purges keys that are no longer allocated to that node. According to the Apache docs, these keys count against the allocated data for that node, which can cause the auto bootstrap process for the next node to not properly balance the ring. So depending on how you are bringing new user machines into the ring, this may or may not be a problem.
Also keep in mind that nodetool cleanup only needs to be run on nodes that lost part of their token range to the new node, i.e. adjacent nodes, not all nodes in the cluster.
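To make "adjacent nodes" concrete, here is a sketch of which existing nodes give up data when one node is inserted into the ring. It is a simplified model, not a Cassandra API: it assumes one token per node (no vnodes), SimpleStrategy-style placement where a range is stored on its owner and the next nodes clockwise, and a replication factor no larger than the number of existing nodes. Under those assumptions the nodes that lose a range are the rf nodes that follow the new token clockwise.

```python
from bisect import bisect_right


def nodes_needing_cleanup(ring, new_token, rf):
    """Return the existing nodes that lose a replica range to the new node.

    ring maps token -> node name for the existing nodes (single-token model).
    Assumes rf <= len(ring); this is an illustrative model, not Cassandra code.
    """
    tokens = sorted(ring)
    # Index of the first existing token strictly after the new token;
    # an index equal to len(tokens) wraps around to the start of the ring.
    start = bisect_right(tokens, new_token)
    count = min(rf, len(tokens))
    return [ring[tokens[(start + i) % len(tokens)]] for i in range(count)]


# Hypothetical four-node ring with evenly spaced tokens.
RING = {0: "A", 100: "B", 200: "C", 300: "D"}

print(nodes_needing_cleanup(RING, new_token=50, rf=3))
print(nodes_needing_cleanup(RING, new_token=350, rf=1))
```

With vnodes each node holds many small ranges spread around the ring, so in practice most or all existing nodes end up with something to clean up.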
