What are the reasons to restart a Cassandra cluster? - cassandra

I have only one reason to restart the cluster so far:
All the nodes have the same hardware configuration.
1. When I update the cassandra.yaml file
Are there other reasons?

What you are asking about is a rolling restart of a Cassandra cluster. There are many reasons to restart a Cassandra cluster; I'm mentioning some below (a per-node restart sketch follows the list):
When you update any value in cassandra.yaml (as you mentioned above).
When nodetool gets stuck somehow. For example, you run nodetool repair and cancel the command, but the repair stays stuck in the background, so you can't start another nodetool repair.
When you are adding a new node to the cluster and streaming fails due to the nproc limit. In that case the nodes of the running cluster can go down because of this issue and hang in that state.
When you don't want to use sstableloader and need to restore your data from snapshots. In that case you copy the snapshots into the data directory on each node and do a rolling restart.
When you are about to upgrade your Cassandra version.
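A rolling restart touches one node at a time so the cluster stays available. A minimal per-node sketch, assuming a package install managed by systemd (the service name may differ on your setup):

$ nodetool drain                 # flush memtables and stop accepting new writes on this node
$ sudo systemctl stop cassandra
$ sudo systemctl start cassandra
$ nodetool status                # wait until the node reports UN (Up/Normal) before moving to the next one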


Related

Does a Cassandra upgrade require running nodetool upgradesstables for a cluster holding TTLed data?

I am running a 3-node Apache Cassandra cluster as Docker containers, holding time-series data with a 45-day TTL.
I am planning to upgrade the current Cassandra version 2.2.5 to the Cassandra 3.11.4 release. The following steps are identified for the upgrade:
1. Backup existing data (a snapshot sketch follows the list)
2. Flush one of the Cassandra nodes
3. bin/nodetool -h cassandra1 -u ca_itoa -pw ca_itoa drain
4. Stop the cassandra1 node
5. Start the new Cassandra 3.11.4 container
6. Upgrade the SSTables: bin/nodetool -u ca_itoa -pw ca_itoa upgradesstables
7. Check the node status. Repeat the process for the rest of the nodes.
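For reference, the backup in step 1 is typically taken with nodetool snapshot; a minimal sketch, where the tag and keyspace names are placeholders:

$ bin/nodetool -u ca_itoa -pw ca_itoa snapshot -t pre-upgrade-2.2.5 my_keyspace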
I have a few questions about the upgrade process:
Are the steps correct?
Is it mandatory to run the upgradesstables command? It is time consuming, and I want to see if I can avoid it. The data has a TTL set. Will Cassandra continue writing in the new SSTable format while the old SSTable data gets cleaned up on expiry? The assumption is that after 45 days all SSTables would be in the new, shiny format.
Just some additional thoughts:
For Step #6, you actually don't have to run upgradesstables right away. In fact, if you're upgrading a production system, it's probably better that you don't until the application team verifies that they can connect ok. Remember, older versions of the driver which work in 2.2 may not work with 3.11.4.
To this end, I would wait until the entire cluster is running on the new version before running upgradesstables on each node.
Is it mandatory to run the upgradesstables command?
As each Cassandra version is capable of reading its own SSTable format as well as that of the prior major version, I guess it's not mandatory. But it's definitely something that you should want to do, especially when upgrading to 3.x.
Cassandra 3 contains a significant upgrade to the storage engine, which results in a much smaller disk footprint. One cluster I upgraded saw a 90% reduction in disk needs.
Plus, you'd be incurring additional latency when reading records which may be spread across the old SSTable files as well as the new. Reads for records across multiple files are bad enough as it is. But now you'd be forcing Cassandra to read and collate results from two formats.
So while I wouldn't say it's "mandatory," I'd definitely say it qualifies as a "good idea."
Yes, you need to run nodetool upgradesstables on each node after the Cassandra upgrade, because you are upgrading from 2.2.x to 3.11.4 and the SSTable file format and extension will also change. You may run this process in the background and it will not cause any issues. Please refer to the link below for more details: https://blog.thethings.io/upgrading-apache-cassandra-cluster/
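A minimal sketch of running the upgrade on one node and checking the on-disk format afterwards; the data path and credentials are assumptions, so adjust them to your data_file_directories setting:

$ nodetool -u ca_itoa -pw ca_itoa upgradesstables
# 2.2.x SSTables use the "la-*" naming, while 3.11.x files start with an "m" generation (ma/mb/mc/md)
$ ls /var/lib/cassandra/data/*/*/*-Data.db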

Nodetool rebuild not working reliably on Cassandra 3.11.3

I presently have a Cassandra 3.11.3 cluster with a single DC. I recently added another DC to my cluster, and I followed the instructions at
https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html
As per the instructions, I ran 'nodetool rebuild -ks -- dc1' on each of the nodes. However, this rebuild command did not actually work as intended. My data is partially missing on the new nodes. I know this because I sampled the data in the new DC through my app using consistency LOCAL_ONE. I don't see the data replenish through read repair either. I should also mention that there were no errors in the logs following the rebuild command, so everything appeared to have succeeded.
What am I missing here? Is there a known issue reported on this?
You should run nodetool rebuild -- <existing DC> on each new node. This command will pull all keyspace data from the existing data center based on the allocated tokens and RF. To ensure consistency, please run a full repair on the nodes as well.
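A minimal sketch of what to run on each node in the new DC, assuming the existing source DC is named dc1 and the keyspace name is a placeholder (the keyspace must already replicate to the new DC via NetworkTopologyStrategy):

$ nodetool rebuild -- dc1             # stream data for all replicated keyspaces from dc1
$ nodetool repair -full my_keyspace   # follow up with a full repair to close any remaining gaps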

How to update configuration of a Cassandra cluster

I have a 3-node Cassandra cluster and I want to make some adjustments to cassandra.yaml.
My question is, how should I perform this? One node at a time, or is there a way to make it happen without shutting down nodes?
Btw, I am using Cassandra 2.2 and this is a production cluster.
There are multiple approaches here:
If you edit the cassandra.yaml file, you need to restart cassandra to re-read the contents of that file. If you restart all nodes at once, your cluster will be unavailable. Restarting one node at a time is almost always safe (provided you have sane replication-factors and consistency-levels). If your cluster is configured to survive a rack or datacenter outage, then you can safely restart more nodes concurrently.
Many settings can be changed without a restart via JMX, though I don't have a documentation link handy. Changing a value via JMX WON'T change cassandra.yaml though, so you'll need to update that as well, or your config will revert back to what's in the file when the node restarts (see the nodetool examples after this answer).
If you're using DSE, OpsCenter's Lifecycle Manager feature makes updating configs a simple point-and-click affair (disclaimer, I'm biased as I'm an LCM dev).
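For the second approach, many of the JMX-exposed settings are also reachable through nodetool, so you can adjust them at runtime and then mirror the change in cassandra.yaml; the values below are only examples:

$ nodetool setcompactionthroughput 64   # overrides compaction_throughput_mb_per_sec (MB/s) until the next restart
$ nodetool setstreamthroughput 200      # overrides stream_throughput_outbound_megabits_per_sec (Mb/s)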

Unable to start DSE using SPARK_ENABLED=1

We are running 6 node cluster with:
HADOOP_ENABLED=0
SOLR_ENABLED=0
SPARK_ENABLED=0
CFS_ENABLED=0
Now we would like to add Spark to all of them. It seems like "adding" is not the right term, because this would not fail. Anyway, the steps we've done:
1. drained one of the nodes
2. changed /etc/default/dse to SPARK_ENABLED=1 and HADOOP_ENABLED=0
3. sudo service dse restart
And got the following in the log:
ERROR [main] 2016-05-17 11:51:12,739 CassandraDaemon.java:294 - Fatal exception during initialization
org.apache.cassandra.exceptions.ConfigurationException: Cannot start node if snitch's data center (Analytics) differs from previous data center (Cassandra). Please fix the snitch configuration, decommission and rebootstrap this node or use the flag -Dcassandra.ignore_dc=true.
There are two related questions that have already been answered:
Unable to start solr aspect of DSE search
Two node DSE spark cluster error setting up second node. Why?
Unfortunately, clearing the data on the node is not an option - why would I do that? I need the data to be intact.
Using "-Dcassandra.ignore_rack=true -Dcassandra.ignore_dc=true" is a bit scary in production. I don't understand why DSE wants to create another DC and why it can't just use the existing one.
I know that according to DataStax's docs one should partition the load using different DCs for different workloads. In our case we just want to run Spark jobs on the same nodes that Cassandra is running on, using the same DC.
Is that possible?
Thanks!
The other answers are correct. This error is trying to warn you that you have previously identified this node as being in another DC. This means that it probably doesn't have the right data for any keyspaces using NetworkTopologyStrategy. For example, if you had an NTS keyspace which had only one replica in "Cassandra" and changed the DC to "Analytics", you could inadvertently lose all of the data.
This warning and the accompanying flag are telling you that you are doing something that you should not be doing in a production cluster.
The real solution to this is to explicitly name your DCs using GossipingPropertyFileSnitch rather than relying on DseSimpleSnitch, which names the DC based on the DSE workload.
In this case, switch to GPFS and set the DC name to Cassandra.
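A minimal sketch of what that looks like on each node; the rack name is a placeholder, and the DC name is kept as the existing one so no data movement is implied:

# cassandra.yaml
endpoint_snitch: GossipingPropertyFileSnitch

# cassandra-rackdc.properties
dc=Cassandra
rack=rack1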

Changing Cluster Name in Cassandra

I have a cluster with 2 machines (CentOS 7 and Cassandra 3.4), 192.168.0.175 and 192.168.0.174. The seed is 192.168.0.175.
I simply want to change the cluster name. Piece of cake, it should be.
On each node I did:
update system.local set cluster_name = 'America2' where key='local';
I did a nodetool flush
I updated cassandra.yaml with the new name
restarted Cassandra.
When I cqlsh to either of them, it describes me as connected to the new cluster_name America2.
When I run nodetool describecluster it shows the old cluster name America.
If I stop Cassandra on both machines and try to restart them, I find the good old error in the logs:
org.apache.cassandra.exceptions.ConfigurationException: Saved cluster name America != configured name America2
So... what am I doing wrong?
Before changing the cluster name:
delete the node from the cluster ring
nodetool decommission
stop the node and change the cluster name in cassandra.yaml
clean the node
sudo rm -rf /var/lib/cassandra/* /var/log/cassandra/*
start the Cassandra node
More information can be found at academy.datastax.com
OK guys, here is what I did:
I cqlsh'ed into each machine and ran:
update system.local set cluster_name = 'canada' where key='local' ;
then
$ nodetool flush -- system
Then I stopped the service on both machines,
modified cassandra.yaml with the new cluster name canada, and
started the machines back up; they were working with the new cluster name.
It is possible to do these steps without stopping all the machines in the cluster, taking them out one by one (I think a repair on each node might be necessary afterwards). Consider changing the seeds first.
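Putting the working sequence together as a per-node sketch; the config path and service name are assumptions that depend on how Cassandra was installed:

cqlsh> update system.local set cluster_name = 'canada' where key='local';
$ nodetool flush -- system
$ sudo service cassandra stop
# edit cluster_name in /etc/cassandra/cassandra.yaml to the new name, then
$ sudo service cassandra start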
It's not really possible. I had the same problem. I solved it in a really dirty way: I wrote a script that exported all my column family data, simply a backup. Then I did this on each node: I stopped Cassandra, dropped all of Cassandra's data, caches, etc. (you can also reinstall Cassandra), created a new cluster, and imported my backup. And I did this for each node.
