I am trying to upgrade Cassandra from 3.0.8 to 3.0.14. I am adding a new node running 3.0.14 to the 3.0.8 cluster, and I see schema disagreement between the nodes, and the new node doesn't stream any data.
I am looking at https://issues.apache.org/jira/browse/CASSANDRA-13559. Does this mean I will not be able to add nodes running a version higher than 3.0.13?
Here is what I see in the nodetool describecluster output:
$ nodetool describecluster
Cluster Information:
Name: production
Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
45ad6427-30a8-3381-9e2c-266b446c6ea7: [192.168.1.2, 192.168.1.3, 192.168.1.4]
c2a2bb4f-7d31-3fb8-a216-00b41a643650: [10.10.1.10]
Is there any workaround to mitigate this?
Did you run nodetool upgradesstables?
As far as I know, you can't add nodes of different versions to an existing cluster. You have to upgrade the existing nodes in place using a rolling upgrade. Check out this SO question or this doc, which detail the steps to do a rolling upgrade.
This is a tad late but I ran into this previously too.
see https://github.com/apache/cassandra/blob/cassandra-3.0/NEWS.txt#L166 for release notes specific to 3.0.14.
You need a temporary flag, -Dcassandra.force_3_0_protocol_version=true, on the 3.0.14 nodes to enable communication between the two versions. There is a gossip incompatibility that causes the schema not to be pulled during bootstrap. You should remove this flag after upgrading the entire cluster and do another rolling restart.
I would guess that in the debug logs you would find a line like "shouldPullSchema returned false" due to this incompatibility.
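A minimal sketch of how that flag might be applied, assuming JVM options are appended in conf/cassandra-env.sh (the exact file and path depend on how Cassandra was installed):

# added near the end of conf/cassandra-env.sh on each 3.0.14 node
JVM_OPTS="$JVM_OPTS -Dcassandra.force_3_0_protocol_version=true"

Once the whole cluster is on 3.0.14, remove the line and do one more rolling restart so the nodes stop forcing the 3.0 protocol version.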
Related
I have a Cassandra cluster with 3 nodes running on v3.11.4
I want to add 3 more nodes to this cluster. Now Cassandra v4 is available, so I have installed it on the new nodes.
When I restart Cassandra, the new nodes are unable to join the cluster.
Error: There are nodes in the cluster with a different schema version than us
I even tried adding the skip_schema option in the jvm-server.options file, but the nodes still could not join.
Please help me: how can I add the new nodes to the existing cluster? I want to keep v4 on the new nodes so I don't have to update them when upgrading the older nodes to v4.
It isn't possible to add nodes running a new major version to the cluster. You will only be able to add nodes running Cassandra 3.11.
They won't be able to stream data to each other because they have different formats. This is the reason you can't run repairs during an upgrade. You also can't add or decommission nodes in the middle of an upgrade. Cheers!
So the plan forward here would be to shut down the Cassandra 4.0 nodes, then upgrade the 3.11 nodes to 4.0. Then adding the new 4.0 nodes should work as expected.
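If the new nodes already attempted (and failed) to bootstrap, it is usually safest to clear their data before trying again. A rough sketch, assuming default package paths and service management; adjust to your data_file_directories, commitlog_directory and init system:

$ sudo service cassandra stop
$ sudo rm -rf /var/lib/cassandra/data/* /var/lib/cassandra/commitlog/* /var/lib/cassandra/saved_caches/*
# upgrade the existing 3.11 nodes to 4.0 first, then start the new nodes one at a time
$ sudo service cassandra start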
I've searched for previous versions of this question, but none seem to fit my case. I have an existing Cassandra cluster running 2.0.x. I've been allocated new VMs, so I do NOT want to upgrade my existing Cassandra nodes - rather I want to migrate to a) new VMs and b) a more current version of Cassandra.
1. I know that for in-place upgrades I would upgrade to the latest 2.0.x, then to the latest 2.1.x. AFAIK, there's no SSTable inconsistency there. If I go this route via the addition of new nodes, I assume I would follow the DataStax instructions for adding new nodes/decommissioning old nodes?
2. Given the above, is it possible to move from 2.0.x to 3.0.x? I know the SSTable format is different; however, if I'm adding new nodes (rather than re-using SSTables on disk), does this matter?
It seems to me that #2 has to work; otherwise, it implies that any upgrade requiring SSTable upgrades would require all nodes to be taken offline simultaneously, since a rolling upgrade would otherwise have mixed 2.x.x and 3.0.x versions running in the same cluster at some point.
Am I completely wrong? Does anyone have any experience doing this?
Yes, it is possible to migrate the data to a different environment (the new VMs with the updated Cassandra) using sstableloader, but you will need C* 3.0.5 or above, as that version added support for uploading sstables from previous versions.
Once the process is completed, it is recommended to run nodetool upgradesstables to ensure that there are no incompatibilities in the data, and then a nodetool cleanup.
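A rough sketch of what that could look like, using the sstableloader that ships with the new 3.0.x install; the host addresses and the keyspace/table path are placeholders for your own environment:

$ sstableloader -d 10.0.0.11,10.0.0.12 /var/lib/cassandra/data/my_keyspace/my_table/

Then, on each node of the new cluster:

$ nodetool upgradesstables
$ nodetool cleanup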
Regarding your comment "...it implies that any upgrade requiring SSTable upgrades would require all nodes to be taken offline simultaneously...": that is not true. Doing the upgrade one node at a time will create a mixed cluster with nodes on the two versions, as you mentioned, which is not optimal but allows you to avoid any downtime in production. (Note that the impact of this operation will depend on the consistency level used by your application.)
Don't worry about the migration. You can migrate your Cassandra 2.0.x cluster straight to Cassandra 3.0.x, but it is better if you migrate from Cassandra 2.0.x to the latest Cassandra 2.x.x first and then to Cassandra 3.0.x. You need to follow these steps:
Back up the data
Uninstall the present version
Install the version you want to upgrade to
Restore the data
As you are doing a migration, you always need to be careful with your data. For the backup and restore, you can go one of two ways:
Create snapshots of your SSTables; then, after installing the new version of Cassandra, place the files in the data location and run sstableloader.
Back up your schemas to a .cql file and copy all the tables out to .csv files; then, after installing the new version of Cassandra, source your schema from the .cql file and copy all the tables back in from the .csv files.
If you are clear on how you will complete the migration, you can write a bash script for the backup and restore steps, along the lines of the sketch below.
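A minimal sketch of the second approach (schema dump plus CSV copy), assuming a cqlsh recent enough to support -e/--execute; the keyspace and table names are placeholders:

#!/usr/bin/env bash
# Hypothetical keyspace and table names; replace with your own.
KEYSPACE=my_keyspace
TABLES="users events"

# Backup, run against the old cluster.
cqlsh -e "DESCRIBE KEYSPACE ${KEYSPACE}" > "${KEYSPACE}_schema.cql"
for t in ${TABLES}; do
  cqlsh -e "COPY ${KEYSPACE}.${t} TO '${t}.csv' WITH HEADER = true"
done

# Restore, run against the new cluster after installing the new version:
#   cqlsh -f "${KEYSPACE}_schema.cql"
#   cqlsh -e "COPY ${KEYSPACE}.${t} FROM '${t}.csv' WITH HEADER = true"   # for each table

Note that CQL COPY is only practical for modestly sized tables; for large data sets the snapshot/sstableloader route is the better fit.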
We are running a 6-node cluster with:
HADOOP_ENABLED=0
SOLR_ENABLED=0
SPARK_ENABLED=0
CFS_ENABLED=0
Now we would like to add Spark to all of them. It seems like "adding" is not the right term, because this would not fail. Anyway, these are the steps we've done:
1. drained one of the nodes
2. changed /etc/default/dse to SPARK_ENABLED=1 and HADOOP_ENABLED=0
3. sudo service dse restart
And got the following in the log:
ERROR [main] 2016-05-17 11:51:12,739 CassandraDaemon.java:294 - Fatal exception during initialization
org.apache.cassandra.exceptions.ConfigurationException: Cannot start node if snitch's data center (Analytics) differs from previous data center (Cassandra). Please fix the snitch configuration, decommission and rebootstrap this node or use the flag -Dcassandra.ignore_dc=true.
There are two related questions that have already been answered:
Unable to start solr aspect of DSE search
Two node DSE spark cluster error setting up second node. Why?
Unfortunately, clearing the data on the node is not an option - why would I do that? I need the data to be intact.
Using "-Dcassandra.ignore_rack=true -Dcassandra.ignore_dc=true" is a bit scary in production. I don't understand why DSE wants to create another DC and why it can't just use the existing one.
I know that, according to DataStax's docs, one should partition the load using different DCs for different workloads. In our case we just want to run Spark jobs on the same nodes that Cassandra is running on, using the same DC.
Is that possible?
Thanks!
The other answers are correct. The error here is trying to warn you that you have previously identified this node as being in another DC. This means that it probably doesn't have the right data for any keyspaces using NetworkTopologyStrategy. For example, if you had an NTS keyspace with only one replica in "Cassandra" and changed the DC to "Analytics", you could inadvertently lose all of the data.
This warning and the accompanying flag are telling you that you are doing something that you should not be doing in a production cluster.
The real solution to this is to explicitly name your DCs using GossipingPropertyFileSnitch (GPFS) rather than relying on DseSimpleSnitch, which names the DC based on the DSE workload.
In this case, switch to GPFS and set the DC name to Cassandra.
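A sketch of what that configuration might look like, assuming a standard conf layout; the rack name is a placeholder, and note that some DSE versions configure the snitch indirectly (for example via a delegating snitch in dse.yaml), so check where your version expects the setting:

$ cat conf/cassandra-rackdc.properties
dc=Cassandra
rack=rack1

$ grep endpoint_snitch conf/cassandra.yaml
endpoint_snitch: GossipingPropertyFileSnitch

Keeping the DC name identical to the one the nodes already report (Cassandra) is what avoids the error above.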
Using DataStax OpsCenter v5.2.4 (currently the latest), installed using AMI ami-8f3e2bbf provided by DataStax, and following DataStax's instructions on how to create a cluster on EC2, all DSE nodes fail during creation with this error:
Install Errored: Could not find a matching version for package dse-libpig
Is there a work around for this?
Note that during the process I selected Package: DataStax Enterprise 4.8.1, which is the latest available in the list at this time.
I faced the same issue and, taking a clue from BrianC's comment, resolved it by removing a trailing '#' from my DataStax account password.
My current Cassandra version is 1.1.2, deployed as a single-node cluster. I would like to upgrade it to 1.2.6 with multiple nodes in the ring. Is it proper to migrate directly to 1.2.6, or should I follow a version-by-version migration?
I found the upgrade steps at this link:
http://fossies.org/linux/misc/apache-cassandra-1.2.6-bin.tar.gz:a/apache-cassandra-1.2.6/NEWS.txt.
There are 9 other releases available between these two versions.
I migrated a two-node cluster from 1.1.6 to 1.2.6 without problems and without going version by version. Anyway, you should take a closer look at:
http://www.datastax.com/documentation/cassandra/1.2/index.html?pagename=docs&version=1.2&file=index#upgrade/upgradeC_c.html#concept_ds_smb_nyr_ck
Because there are a lot of new features in version 1.2, like the partitioners, you may need to change some configuration for your cluster.
You may hop directly to C* 1.2.6.
We migrated our 4-node cluster from C* 1.0.9 to C* 1.2.8 recently without any issues. This was a rolling upgrade, i.e. upgrade one node at a time, and after each node allow the cluster to stabilize (how long depends on the traffic during the upgrade).
These are the steps that we followed:
Perform the steps below on each node:
Run nodetool disablegossip and nodetool disablethrift, so that this node is seen as DOWN by the other nodes.
Flush/drain the memtables and run compaction to merge SSTables.
Take a snapshot and enable incremental backups.
This stops all the other nodes/clients from writing to this node, and since the memtables are flushed to disk, startup times are fast as it need not walk through commit logs.
Stop Cassandra (though this node is down, the cluster is still available for writes/reads, so zero downtime).
Upgrade the SSTables to the new storage format using sstableupgrade.
Install/untar Cassandra 1.2.8 in the new location.
Move the upgraded SSTables to the appropriate location.
Merge cassandra.yaml from the previous version and the current version with a manual diff (you need to work through the differences).
Start Cassandra.
Watch the startup messages to ensure the node comes up without difficulty and is shown in the ring with the mixed 1.0.x/1.2.x nodes.
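For reference, a rough command-level sketch of those per-node steps, assuming a tarball install; service management, tool locations and keyspace/table names will differ in your environment:

$ nodetool disablegossip && nodetool disablethrift
$ nodetool flush
$ nodetool snapshot
$ nodetool drain
# stop the old Cassandra process, untar 1.2.8, merge cassandra.yaml by hand
# run the offline sstableupgrade tool that ships with 1.2.8 (location varies by install), e.g.:
#   sstableupgrade my_keyspace my_table    # keyspace/table names are placeholders
# move the upgraded SSTables into the new data directory, then:
$ bin/cassandra
$ nodetool ring    # confirm the node is back in the ring alongside the remaining old-version nodes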