Better understand SSTables Formats big vs bti - cassandra

Where can I find more information on which version of Cassandra supports which version of SSTables?
Recently I noticed that DSE Cassandra is generating bti SSTables, while Apache Cassandra 3.11.4 continues to generate big.
Do you know what the difference is, and when Apache Cassandra will start using bti?
Thanks in advance!

The bti file format is a proprietary file format developed by DataStax for DSE 6, so no information about its internals is available. It has a number of optimizations; for example, the key cache is no longer required. Apache Cassandra won't support it until DataStax opens up the details of the format.
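
To see which format a node is actually writing, you can inspect the SSTable file names in the data directory. The sketch below is a minimal example, assuming the file name layout `<version>-<generation>-<format>-<component>` (for example `md-1-big-Data.db`); the helper name `parse_sstable_name` and the sample file names are illustrative, not part of any Cassandra tooling.

```python
import re
from typing import NamedTuple, Optional


class SSTableName(NamedTuple):
    version: str     # two-letter SSTable version, e.g. "md"
    generation: str  # generation identifier, e.g. "1"
    fmt: str         # on-disk format: "big" or "bti"
    component: str   # file component, e.g. "Data.db"


# Assumed layout: <version>-<generation>-<format>-<component>
_PATTERN = re.compile(r"^([a-z]{2})-([0-9A-Za-z_]+)-(big|bti)-(.+)$")


def parse_sstable_name(filename: str) -> Optional[SSTableName]:
    """Split an SSTable file name into its parts, or return None if it does not match."""
    match = _PATTERN.match(filename)
    return SSTableName(*match.groups()) if match else None


if __name__ == "__main__":
    for name in ["md-1-big-Data.db", "aa-7-bti-Data.db", "notes.txt"]:
        print(name, "->", parse_sstable_name(name))
```

File names from older releases that use a different layout simply return `None` here.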

Related

Node operation and upgrade issues with minor version differences in Cassandra

Excuse me,
Can 3.11.10 nodes be added to the 3.11.4 cluster?
If I want to upgrade from 3.11.4 to 3.11.10, do I need to run upgradesstables?
Thank you!
Usually it's not recommended to mix different versions of Cassandra inside a cluster, except while you're performing an upgrade. The reason is that the streaming protocol used for bootstrapping and removing nodes, and for running repairs, may differ between versions. It could be OK for versions inside the same major version (3.11), but it makes sense to check the changelog for any changes that may affect streaming.
For an upgrade from 3.11.4 to 3.11.10 you don't need to run upgradesstables. This step is always optional, as SSTables will be written in the new format when compaction happens. The usual recommendation is to run it explicitly mostly in cases where you can benefit from better performance with the new file format, or from bug fixes.
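
If you do decide to rewrite SSTables explicitly, the command is `nodetool upgradesstables`, run on each node. Here is a minimal sketch that only builds the argument list; the helper name `upgradesstables_command` and the keyspace/table names are hypothetical, and `-a` is the nodetool option that includes SSTables already on the current version.

```python
from typing import List, Optional


def upgradesstables_command(keyspace: Optional[str] = None,
                            tables: Optional[List[str]] = None,
                            include_all: bool = False) -> List[str]:
    """Build the argument list for `nodetool upgradesstables`."""
    cmd = ["nodetool", "upgradesstables"]
    if include_all:
        # -a also rewrites SSTables that are already on the current version
        cmd.append("-a")
    if keyspace:
        cmd.append(keyspace)
        # Table names are only meaningful together with a keyspace
        cmd.extend(tables or [])
    return cmd


if __name__ == "__main__":
    # "shop" and "orders" are placeholder names for illustration
    print(" ".join(upgradesstables_command("shop", ["orders"])))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a host where `nodetool` is on the PATH.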

Does DataStax OpsCenter support Cassandra 2.2.5?

I'm having some issues with my Cassandra cluster and I would like to install OpsCenter Community in order to debug what's going on.
I've found this and this page talking about the compatibility between DataStax OpsCenter and Cassandra, but they don't list Cassandra 2.2.5 (actually I'm using DataStax Cassandra - dsc 22).
My question is: can I use DataStax OpsCenter (free / community version) with Cassandra 2.2.5? If not, is there an alternative?
No, the docs you cited indicate that OpsCenter doesn't support any Cassandra version greater than 2.1.x, and the next version of OpsCenter (6.x) will only support DataStax Enterprise. I don't know of another visual front end to Cassandra at this time.

Migrating from Cassandra 2.2.0 to DSE 4.8.5 (Cassandra 2.1.3)

I have been building an application using Apache Cassandra 2.2.0 for some time now. We plan to start using DataStax Enterprise 4.8.5 (which is built on Apache Cassandra 2.1.3).
The problem is that, as this is a downgrade of the Cassandra version (2.2.0 -> 2.1.3), I am not able to read the SSTables created by Cassandra 2.2.0.
What can I do to have my old data available with DSE 4.8.5?
This is not supported. You should consider contacting DataStax for advice (that's one of the advantages of paying for DSE: you get someone to talk to about topics like this).
You'd almost certainly have to export the data and re-import it, either using sstable2json, or COPY TO + COPY FROM in cqlsh to go through a CSV file, or something like Spark or CQLSSTableWriter to recreate 2.1 SSTables.
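
As a rough illustration of the COPY TO + COPY FROM route, the sketch below generates the two cqlsh statements for one table. The helper name `copy_statements`, the keyspace and table names, and the CSV path are all hypothetical; the table must already exist on the target cluster before running COPY FROM.

```python
from typing import Tuple


def copy_statements(keyspace: str, table: str, csv_path: str) -> Tuple[str, str]:
    """Return (export, import) cqlsh COPY statements for a single table."""
    target = f"{keyspace}.{table}"
    # Run the export against the old (2.2.0) cluster
    export_stmt = f"COPY {target} TO '{csv_path}' WITH HEADER = TRUE;"
    # Run the import against the new cluster after creating the schema there
    import_stmt = f"COPY {target} FROM '{csv_path}' WITH HEADER = TRUE;"
    return export_stmt, import_stmt


if __name__ == "__main__":
    # "shop", "orders" and the path are placeholders for illustration
    for stmt in copy_statements("shop", "orders", "/tmp/orders.csv"):
        print(stmt)
```

Each statement can be pasted into a cqlsh session or passed with `cqlsh -e "<statement>"`.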

Strip Datastax binary to have only Cassandra

I have downloaded the latest DataStax binary, 4.5.2. It comes loaded with Hive, Hadoop, Solr, etc., which I am not interested in. I just want to bundle Cassandra with my product. I tried removing all the folders from dse-4.5.2/resources except cassandra, and then tried starting Cassandra by executing the command below from dse-4.5.2/bin:
./dse cassandra
However, it failed. So it looks like it's not as simple as deleting folders.
Has anyone ever tried this?
DSE will not use hive, hadoop, solr, etc. unless you explicitly ask it to.
For example, to start DSE with search, run:
dse cassandra -s
If you just run dse cassandra, it will only start the Cassandra process.
I'd recommend using Apache Cassandra for this. Here's a Puppet module that you might like: https://github.com/heartysoft/puppet-cassandra

DataStax Enterprise with HDFS and Spark without Cassandra

Is it possible to work with DSE, HDFS, Spark, but without Cassandra?
I'm trying to replace CFS (Cassandra File System) with HDFS (Hadoop in DSE), but
dse hadoop fs -help
needs Cassandra.
Cassandra takes a lot of memory; I hope that with HDFS only we'll get more free RAM on each node.
Calling DSE Hadoop actually uses the Cassandra File System instead of HDFS, so you cannot run it without Cassandra running. DataStax does support a BYOH (bring your own Hadoop) option, but that involves using a third-party Hadoop. If you don't want Cassandra, though, I would not recommend using the DSE packaging.
