Which Apache Cassandra version to use for production - cassandra

We are exploring Apache Cassandra and plan to use it in production soon.
We will most likely use the DataStax Community edition of Apache Cassandra.
But after reading:
http://www.planetcassandra.org/blog/cassandra-2-2-3-0-and-beyond/
https://www.pythian.com/blog/cassandra-version-production/
and in particular this sentence from the blog above: “If you don’t mind facing serious bugs and contribute to the development pick 3.x”
I am confused about which version to choose for our production deployment.
I just need to know whether 3.5.0 and 3.0.6 are production ready.
DataStax Community 3.5.0 from http://www.planetcassandra.org/cassandra/
DataStax Community 3.0.6 from http://www.planetcassandra.org/archived-versions-of-datastaxs-distribution-of-apache-cassandra/
or
DataStax Community 2.2.6 from http://www.planetcassandra.org/archived-versions-of-datastaxs-distribution-of-apache-cassandra/

The version provided by DataStax is supposed to be stable and production ready. You get an application to monitor your cluster, which is nice if you don't have any ops people who know Cassandra in the first place, and you can pay to get support.
However, you don't get the latest version of Cassandra, so you can miss interesting features.
As for Cassandra 3.x, as said above, you get more features (for example JSON support) and better performance, but if you find a critical bug and can't fix it yourself, you can only file a ticket and hope it is taken care of quickly. Still, it is production ready and it could work well for you.
In conclusion, go for the latest version only if you need a specific feature, or if you have developers on your team who can back that choice. Go for DataStax if you want something that works with less effort.

Related

Recommended Datastax Drivers version to connect to Cassandra Version 4.0

Currently we are using the DataStax Java driver version 3.7.2 to connect to open source Apache Cassandra version 3.11.9.
We are planning to upgrade to open source Apache Cassandra version 4. Can someone please let me know which DataStax Java driver version is recommended for connecting to Cassandra 4? In the article below, DataStax mentions that driver version 3.11 is partially compatible with Cassandra 4.x, but there is not much information on what "partially compatible" means.
https://docs.datastax.com/en/driver-matrix/docs/java-drivers.html
First, Apache Cassandra® 4.1 was already released last December, so you may want to look at upgrading to that rather than to 4.0.x.
Next, "partially compatible" is explained in a footnote in the docs:
4: Limited to the Cassandra 3.x and 2.2.x API.
Also, here are some excerpts from the mailing list discussions:
Neither the 4.x nor the 3.x Java drivers are in maintenance mode at the moment. It is very much true that any new Java driver features will be developed on the 4.x branch and in general will not be ported to 3.x. 3.x will continue to receive CVE and other critical bug fixes but as mentioned there are no plans for this branch to receive any new features. It's not completely impossible that a specific feature or two might make its way to 3.x on a case-by-case basis but if you're planning for the future with 3.x you should do so with the expectation that it will receive no new features.
&
Having said that, I would strongly recommend and encourage you to upgrade to version 3.11.3 of the Java driver (released on Sep 20, 2022), which is directly binary compatible with the version that you're using today, 3.7.2 (released on Jul 10, 2019), to leverage features and fixes (including many CVE patches). In addition, I would also suggest sketching out a plan to upgrade your apps to the 4.x driver, or looking into modernizing them to interact with your Apache Cassandra®/DSE®/Astra DB® cluster via the Stargate® APIs.

Upgrading from GridGain to Apache Ignite

We're currently running GridGain 6.2.1. Is there an existing upgrade guide for transitioning to Apache Ignite?
There is no such guide, and it depends heavily on which parts of GridGain you're using. All functionality that existed in 6.x was migrated to Ignite with a slightly different API, so I suggest updating the version and fixing the compilation errors step by step.

YCSB for Cassandra 3.0 Benchmarking

I have a Cassandra cluster running on Ubuntu virtual machines and need to benchmark it.
I am trying to do it with Yahoo's YCSB (without using Maven if possible).
I use Cassandra 3.0.1 but I can't find a suitable version of YCSB.
I don't want to switch to an older version of Cassandra (the latest YCSB Cassandra binding is for Cassandra 2.x).
What should I do?
As suggested here, although Cassandra 3.x is not officially supported, you can use the cassandra-cql binding.
For instance:
./bin/ycsb load cassandra-cql -threads 4 -P workloads/workloada
I just tested it on Cassandra 3.11.0 and it works for both load and run.
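To make the load and run phases concrete, here is a minimal sketch of how the two commands could be scripted. It is not from the original answer: the `HOSTS`, `THREADS` and `DRY_RUN` variables and the `ycsb_cmd` helper are names made up for illustration, and `-p hosts=...` assumes the binding reads its contact points from a `hosts` property. As far as I recall, the binding also expects a `ycsb` keyspace with a `usertable` table to exist before loading, so check the binding's README for the exact schema.

```shell
#!/bin/sh
# Sketch only: HOSTS and THREADS are placeholder values, not recommendations.
HOSTS="${HOSTS:-127.0.0.1}"
THREADS="${THREADS:-4}"
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: builds the YCSB command line for one phase.
# $1 is the YCSB phase, either "load" or "run".
ycsb_cmd() {
  echo "./bin/ycsb $1 cassandra-cql -p hosts=$HOSTS -threads $THREADS -P workloads/workloada"
}

# Load the data set first, then run the workload against it.
for phase in load run; do
  cmd=$(ycsb_cmd "$phase")
  if [ "$DRY_RUN" = "1" ]; then
    echo "$cmd"
  else
    $cmd
  fi
done
```

Run it from the YCSB directory with `DRY_RUN=0` once the printed commands look right for your cluster.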
That said, the benchmark software to use depends on your test plan. If you want to benchmark only Cassandra, then @gsteiner's solution might be the best. If you want to benchmark different databases with the same tool to avoid variability, then YCSB is the right one.
I would recommend using cassandra-stress to perform a load/performance test on your Cassandra cluster. It is very customizable, to the point that you can test distributions with different data models as well as specify how hard you want to push your cluster.
The DataStax documentation explains how to use the tool in depth:
https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsCStress_t.html
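As a rough starting point, a basic write test followed by a read test could look like the sketch below. The node address, operation count and thread count are placeholder assumptions, and the `stress_cmd` helper and `DRY_RUN` switch are invented for this example; see the documentation above for the full set of options.

```shell
#!/bin/sh
# Sketch only: NODE, OPS and THREADS are placeholder values to adjust.
NODE="${NODE:-127.0.0.1}"
OPS="${OPS:-100000}"
THREADS="${THREADS:-16}"
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: builds the cassandra-stress command line.
# $1 is the operation to run, either "write" or "read".
stress_cmd() {
  echo "cassandra-stress $1 n=$OPS -rate threads=$THREADS -node $NODE"
}

# Write first so that the read test has data to read back.
for op in write read; do
  cmd=$(stress_cmd "$op")
  if [ "$DRY_RUN" = "1" ]; then
    echo "$cmd"
  else
    $cmd
  fi
done
```

Set `DRY_RUN=0` to actually run the tests on a machine where cassandra-stress is on the PATH.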

Does Cassandra work with the IBM JVM

Can I install and start Cassandra on an x-linux OS with an IBM SDK for Java?
Will that work? Is there a specific version (2.1, 2.0) that will work?
Thanks in advance.
You are right, it should work. The only issue is in cassandra-env.sh, where you need to comment out some of the checks.
Yes it should. According to the Apache Cassandra project site:
Cassandra requires the most stable version of Java 7 or 8 you can deploy, preferably the Oracle/Sun JVM. Cassandra also runs on OpenJDK and the IBM JVM.
As far as I can tell, it doesn't indicate that only specific versions of Cassandra work with the IBM JVM. That being said, the documentation on the DataStax site specifically mentions the Oracle JVM in the installation steps. It is the recommended JVM to use.
I can't remember for sure, but I have heard of people running into issues that were traced back to the OpenJDK. I don't recall anything specific to the IBM JVM. So it might work, but it isn't supported and you might leave yourself open to some unforeseen errors.

Apache Cassandra vs Datastax Cassandra [closed]

Is DataStax Cassandra the only Cassandra that can be used in a production environment? Are there any free alternatives available? What about the Cassandra available on the Apache site?
DataStax Community Edition is also free; it contains a basic version of OpsCenter -- http://planetcassandra.org/cassandra/
Here is the difference between the community edition and DSE
http://www.datastax.com/download/dse-vs-dsc
They can both be used in production. DataStax Enterprise comes with a bunch of extra features on top of Apache Cassandra, and also comes with support.
DataStax is a commercial company that supports C*. The base source code of Cassandra is taken from the Apache repositories, and then some of their own code is merged in. Besides this, as already mentioned by others, the DataStax version comes with some additional tools for maintaining a Cassandra cluster.
One of the benefits of DataStax Enterprise is its seamless integration with Solr, another great Apache Foundation project.
Cassandra comes with a query language called CQL (Cassandra Query Language), which is "similar" to SQL; you should however think of CQL as a cousin of SQL, not a brother.
One of the great features of the Enterprise edition is that you can query a Solr index through their CQL integration. A Cassandra cluster also shares its resources with Solr, so you don't need a second cluster for Solr.
You could set up either Apache or DataStax Cassandra and you would get almost the same thing. But if you need something similar to the SQL LIKE statement (not natively available in Cassandra), or you have a heavily denormalized database and need search capabilities, then DataStax Enterprise (DSE) is your only viable choice.
As someone already has mentioned, DSE is free for startups until they reach an annual revenue of 3m USD, or are funded with 30m. This should give everybody the opportunity to leverage the power of NoSQL and use one of the most reliable databases for big data out there.
For the Cassandra product, you can use the Apache open source offering in production, if your organisation is comfortable with open source.
You can also use the Datastax Community version of Cassandra, which is also open source and free to deploy; that gives you a bit more assurance from DataStax who offer commercial support.
Then there is DataStax Enterprise, which is the version that you pay to use, with a support model included. This still uses open source Cassandra, with additional code from DataStax. They have also put this release through their internal test processes, so that they are happy to support it. That generally means the releases will lag behind the Apache and Community versions, if that matters to you.
The DataStax 'Dev Center' product is a GUI tool that allows you to enter CQL commands against a Cassandra installation; it is free to use against any release. You may find it useful, though the CQLSH command line should offer much of what you may need (and Cassandra CLI).
The DataStax 'Ops Center' product is available in a free version, which can run against any Cassandra with the associated 'DataStax Agent' used to collect data from each node. The Enterprise version of Ops Center includes additional functionality; that is available if you purchase the fully supported DSE (DataStax Enterprise) stack.
Hope that helps. Much more information available at Planet Cassandra and the DataStax web sites.
Besides Apache Cassandra, there's Scylla, which is a drop-in replacement for Cassandra written in C++. It claims to be 10 times faster than Apache Cassandra. However, at the time of writing Scylla is still in alpha, and you should stay away from it in a production environment.
Scylla aims to support all Cassandra features together with the tooling. It also supports JMX monitoring.
Apache Cassandra has all the same features as the DataStax Community edition, so you can run Apache Cassandra in a production environment.
Another good feature of DSE is the ability to do backup and recovery of your Cassandra database which I would think is very important if you are planning to use this in a production setup.

Resources