DSE 5 and DSE 4.8.9 in Same Cluster

Is it at all possible to have two different DSE versions in the same cluster? In my case, I have a cluster of two DSE 5 nodes and another of two DSE 4.8.9 nodes. Can I connect them so that data is replicated from DSE 4.8.9 to DSE 5 in real time?

No. If you were to try this, you'd be in an "upgrade state," and clusters in an upgrade state are bound by these restrictions:
Do not enable new features.
Do not run nodetool repair.
Do not issue these types of CQL queries during a rolling restart: DDL and TRUNCATE.
During upgrades, the nodes on different versions might show a schema disagreement.
Failure to upgrade SSTables when required results in a significant performance impact and increased disk usage. Upgrading is not complete until the SSTables are upgraded.
Trying something like this would be further complicated by the fact that DSE 4.8.9 is based on Cassandra 2.1 while DSE 5.0 is based on Cassandra 3.0. There were significant changes between those two Cassandra versions, so you would undoubtedly run into problems.
The best way to go about this would be to upgrade your 4.8.9 nodes to 5.0 first, and then add your new 5.0 nodes to the cluster.
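For reference, a minimal shell sketch (hostnames are placeholders) for checking which version each node is running and whether the schema has converged, which is the state you want before and after the upgrade:

    for host in node1 node2 node3 node4; do   # placeholder hostnames
        echo "== $host =="
        nodetool -h "$host" version           # Cassandra release backing this node
    done
    nodetool describecluster                  # should list exactly one schema version once the upgrade is done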

Related

New Cassandra nodes in cluster with different version

I have a Cassandra cluster with 3 nodes running v3.11.4, and I want to add 3 more nodes to it. Since Cassandra v4 is now available, I installed it on the new nodes.
When I restart Cassandra, the new nodes are unable to join the cluster.
Error: There are nodes in the cluster with a different schema version than us
I even tried adding the skip_schema option in the jvm-server.options file, but the nodes still could not join.
Please help me: how can I add the new nodes to the existing cluster? I want to keep v4 on the new nodes so I don't have to update them when upgrading the older nodes to v4.
It isn't possible to add nodes running a new major version to the cluster; you will only be able to add nodes running Cassandra 3.11.
The nodes won't be able to stream data to each other because their on-disk formats differ. This is the same reason you can't run repairs during an upgrade, and why you can't add or decommission nodes in the middle of one. Cheers!
So the plan forward here would be to shut down the Cassandra 4.0 nodes, then upgrade the 3.11 nodes to 4.0. After that, adding the new 4.0 nodes should work as expected.
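If it helps, here is a rough per-node sketch of that 3.11-to-4.0 upgrade, assuming a package-based install managed by systemd; package, version, and service names may differ in your environment:

    nodetool drain                   # flush memtables and stop this node accepting writes
    sudo systemctl stop cassandra
    sudo apt-get install cassandra   # install/pin the 4.0 release you are targeting
    # merge your existing cassandra.yaml settings into the new file before starting
    sudo systemctl start cassandra
    nodetool upgradesstables         # rewrite SSTables into the 4.0 on-disk format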

Migrate Datastax Enterprise Cassandra to Apache Cassandra

We are currently using DSE 4.8 and 5.12, and we want to migrate to Apache Cassandra. Since we don't use Spark or Search, we thought we could save some money by moving to Apache. Can this be achieved without downtime? I see that sstableloader works the other way around. Can anyone share the steps to follow to migrate from DSE to Apache Cassandra, something like this article but from DSE to Apache:
https://support.datastax.com/hc/en-us/articles/204226209-Clarification-for-the-use-of-SSTABLELOADER
Figure out which version of Apache Cassandra is being run by DSE. Based on the DSE documentation, DSE 4.8.14 uses Apache Cassandra 2.1 and DSE 5.1 uses Apache Cassandra 3.11.
The simplest way to do this is to build another DC (a logical DC, in Cassandra terms) and add it to the existing cluster.
As usual, run nodetool rebuild {from-old-DC} on the new DC's nodes and let Cassandra take care of streaming data to the new Apache Cassandra nodes naturally.
Once data streaming is complete, switch the applications' local_dc to DC2 (the new DC), according to the LoadBalancingPolicy they use. Once the new DC starts taking traffic, shut down the nodes in the old DC (say DC1) one by one.
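A hedged sketch of that flow, where the DC names DC1/DC2, the keyspace my_ks, and the replication factors are placeholders for your own topology:

    # extend replication of each application keyspace to the new DC
    cqlsh -e "ALTER KEYSPACE my_ks WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3};"
    # on each new Apache Cassandra node in DC2, stream existing data from the old DC
    nodetool rebuild -- DC1
    # watch streaming progress
    nodetool netstats | grep -i receiving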
Alternatively, replace the DSE nodes in place (sketched below):
Alter the dse_system and dse_security keyspaces so they no longer use the DSE-only EverywhereStrategy.
On the non-seed nodes, clean up the Cassandra data directory.
Turn on the replace option in cassandra-env.sh.
Start the instance.
Monitor the streaming process with 'nodetool netstats | grep Receiving'.
Change the seed node definitions and do a rolling restart before finally migrating the previous seed nodes.
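A hedged command sketch of those steps; the DC name, replication factor, file paths, and the 10.0.0.5 address of the node being replaced are all placeholders:

    # dse_system and dse_security use the DSE-only EverywhereStrategy, which Apache
    # Cassandra does not ship, so move them to NetworkTopologyStrategy first
    cqlsh -e "ALTER KEYSPACE dse_system WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3};"
    cqlsh -e "ALTER KEYSPACE dse_security WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3};"

    # on the replacement node, point cassandra-env.sh at the address of the node being replaced
    echo 'JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=10.0.0.5"' | sudo tee -a /etc/cassandra/cassandra-env.sh

    # start the instance, then monitor streaming
    sudo systemctl start cassandra
    nodetool netstats | grep Receiving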

Cassandra Upgrade limitations

We are upgrading from DSE 4.5 to DSE 4.8.9 in a 10-node PRODUCTION cluster.
We have daily batch jobs running in our application which bulk-load data into the cluster; some jobs TRUNCATE tables and load fresh data, and some loader jobs continuously insert data.
Consider these scenarios:
Case 1:
Let's say one of my nodes has DSE 4.8 installed but upgradesstables is still running.
All nodes are online at this moment and two different schema versions exist (9 nodes on DSE 4.5 and 1 node on DSE 4.8.9).
In this case, will TRUNCATE work?
Case 2:
One of my nodes is fully upgraded to DSE 4.8, which puts my cluster in a partially upgraded state; all nodes are online and two schema versions exist (9 nodes on DSE 4.5 and 1 node on DSE 4.8).
Will TRUNCATE work in this case?
Please suggest.
Thanks!
It's not recommended to issue a TRUNCATE command during an upgrade; this is one of the limitations outlined here.
To quote the link:
Do not enable new features.
Do not run nodetool repair.
Do not issue these types of CQL queries during a rolling restart: DDL and TRUNCATE.
During upgrades, the nodes on different versions might show a schema disagreement.
Failure to upgrade SSTables when required results in a significant performance impact and increased disk usage. Upgrading is not complete until the SSTables are upgraded.
It should be standard practice to upgrade the binaries on all nodes first, so that there is one schema across the cluster.
Avoid the use of TRUNCATE until all nodes have completed running upgradesstables.
The comment given by markc is also worth noting; it reiterates the same restrictions quoted above.
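As a sanity check before resuming the TRUNCATE-ing batch jobs, something like the following shell sketch (hostnames are placeholders) confirms the cluster has left its partially upgraded state:

    for host in node1 node2; do       # placeholder hostnames; list all 10 nodes
        nodetool -h "$host" version   # every node should report the same release
    done
    nodetool describecluster          # should show a single schema version
    nodetool compactionstats          # upgradesstables runs as compactions; an idle listing suggests the rewrite has finished on this node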

Can we use sstableloader when the cluster is in partially upgraded state?

We have a cluster with two nodes on DSE 4.8 and one on DSE 4.5. Can we use sstableloader to stream snapshot data from DSE 4.5 into the cluster?
Streaming is one of the operations you should avoid until your cluster is fully upgraded. Note during an upgrade you may see a schema mismatch across nodes. The upgrade limitations docs here outline some of the things you should avoid:
https://docs.datastax.com/en/upgrade/doc/upgrade/datastax_enterprise/upgrdDSE47to48.html#upgrdDSE47to48__upglim
I can see that you're upgrading to DSE 4.8 from DSE 4.5. These versions use Cassandra 2.1 and 2.0 respectively, and the SSTable format changed between them, so make sure you also run upgradesstables.
It would be a good idea to complete your upgrade and then try to stream the data. Use the DSE 4.8 / Cassandra 2.1 sstableloader to do the loading; it should be able to stream the older-format tables. The following JIRA seems to indicate that support for this was added:
https://issues.apache.org/jira/browse/CASSANDRA-5772
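Once the upgrade (including upgradesstables on every node) is finished, a typical sstableloader invocation looks like the sketch below. The keyspace, table, snapshot name, paths, and target address are placeholders; note that sstableloader expects the last two path components to be <keyspace>/<table>:

    mkdir -p /tmp/load/my_ks/my_table
    cp /var/lib/cassandra/data/my_ks/my_table*/snapshots/pre_upgrade/* /tmp/load/my_ks/my_table/
    sstableloader -d 10.0.0.10 /tmp/load/my_ks/my_table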

Migrating Cassandra from 1.1.2 to 1.2.6

My current Cassandra version is 1.1.2, deployed as a single-node cluster, and I would like to upgrade it to 1.2.6 with multiple nodes in the ring. Is it proper to migrate directly to 1.2.6, or should I follow a version-by-version migration?
I found the upgrading steps from this link
http://fossies.org/linux/misc/apache-cassandra-1.2.6-bin.tar.gz:a/apache-cassandra-1.2.6/NEWS.txt.
There are 9 other releases between these two versions.
I migrated a two-node cluster from 1.1.6 to 1.2.6 without problems and without going version by version. In any case, you should take a closer look at:
http://www.datastax.com/documentation/cassandra/1.2/index.html?pagename=docs&version=1.2&file=index#upgrade/upgradeC_c.html#concept_ds_smb_nyr_ck
Because there are a lot of new features in version 1.2, such as the partitioners, you may need to change some configuration for your cluster.
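One concrete example of such a configuration change, as I understand it: the cassandra.yaml shipped with 1.2 defaults to Murmur3Partitioner and introduces vnodes (num_tokens), but a cluster upgraded from 1.1 must keep the partitioner it was created with (typically RandomPartitioner). A quick check on each node, assuming the usual config path:

    grep -E '^(partitioner|num_tokens)' /etc/cassandra/conf/cassandra.yaml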
You may hop directly to C1.2.6.
We migrated our 4-node cluster from C1.0.9 to C1.2.8 recently without any issues. This was a rolling upgrade, i.e. upgrade one node at a time and, after each node's upgrade, allow the cluster to stabilize (how long depends on the traffic during the upgrade).
These are the steps that we followed (a command-level sketch follows the list):
Perform the below on each node.
Run nodetool disablegossip and nodetool disablethrift, so that this node is seen as DOWN by the other nodes.
Flush/drain the memtables and run compaction to merge SSTables.
Take a snapshot and enable incremental backups.
This stops all the other nodes/clients from writing to this node, and since the memtables are flushed to disk, startup is fast because it does not need to walk through commit logs.
Stop Cassandra (though this node is down, the cluster is still available for reads/writes, so zero downtime).
Upgrade the SSTables to the new storage format using sstableupgrade.
Install/untar Cassandra 1.2.8 in the new location.
Move the upgraded SSTables to the appropriate location.
Merge the cassandra.yaml from the previous version and the current version with a manual diff (the differences need to be worked through).
Start Cassandra.
Watch the startup messages to ensure the node comes up without difficulty and is shown in the ring alongside the mixed 1.0.x/1.2.x versions.
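A rough command-level sketch of the list above, assuming a tarball install; paths, the snapshot tag, and the process-stop command are placeholders, and running nodetool upgradesstables after the new version is up (as shown here) is an equally common ordering for the SSTable rewrite:

    nodetool disablegossip
    nodetool disablethrift
    nodetool flush
    nodetool drain
    nodetool snapshot -t pre_1_2_upgrade
    pkill -f CassandraDaemon                            # stop the old Cassandra process
    tar xzf apache-cassandra-1.2.8-bin.tar.gz -C /opt
    # hand-merge cassandra.yaml changes from the old install, then start the new version
    /opt/apache-cassandra-1.2.8/bin/cassandra
    nodetool ring                                       # confirm the node rejoins the ring
    nodetool upgradesstables                            # rewrite SSTables into the 1.2 format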
