Does Cassandra work with the IBM JVM? - cassandra

Can I install and start Cassandra on an x-linux OS with an IBM SDK for Java?
Will that work? Is there a specific version (2.1, 2.0) that will work?
Thanks in advance.

You are right, it should work. The only issue is in cassandra-env.sh, where you need to comment out some of the checks.

Yes it should. According to the Apache Cassandra project site:
Cassandra requires the most stable version of Java 7 or 8 you can deploy, preferably the Oracle/Sun JVM. Cassandra also runs on OpenJDK and the IBM JVM.
As far as I can tell, it doesn't indicate that only specific versions of Cassandra work with the IBM JVM. That being said, the documentation on the DataStax site specifically mentions the Oracle JVM in the installation steps. It is the recommended JVM to use.
I can't remember for sure, but I have heard of people running into issues that were traced back to OpenJDK. I don't recall anything specific to the IBM JVM. So it might work, but it isn't the recommended setup, and you might leave yourself open to some unforeseen errors.
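Before deciding, it can help to confirm which JVM the host's `java` actually is. The following is a minimal sketch that prints the standard `java.vendor`, `java.vm.name`, and `java.version` system properties. The class name `JvmInfo` is hypothetical and not part of Cassandra, and the exact strings printed depend on the installed JVM.

```java
// JvmInfo is a hypothetical helper name used only for this illustration.
class JvmInfo {
    // Builds a one-line summary from standard JVM system properties.
    static String describe() {
        return String.format("vendor=%s, vm=%s, version=%s",
                System.getProperty("java.vendor"),
                System.getProperty("java.vm.name"),
                System.getProperty("java.version"));
    }

    public static void main(String[] args) {
        // Prints the vendor, VM name, and Java version of the running JVM.
        System.out.println(describe());
    }
}
```

Run it with the same `java` binary that Cassandra will use, so the output reflects the JVM that is actually picked up at startup.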

Related

Recommended Datastax Drivers version to connect to Cassandra Version 4.0

Currently we are using the DataStax Java driver version 3.7.2 to connect to open source Apache Cassandra version 3.11.9.
We are planning to upgrade to open source Apache Cassandra version 4. Can someone please let me know which DataStax Java driver version is recommended for connecting to Cassandra version 4? In the article below, DataStax mentions that driver version 3.11 is partially compatible with Cassandra 4.x, but there is not much information on what "partially compatible" means.
https://docs.datastax.com/en/driver-matrix/docs/java-drivers.html
First, Apache Cassandra® 4.1 was already released last December, so you may want to look at upgrading to that rather than 4.0.x.
Next, "partially compatible" is explained in a footnote in the docs:
^4^ Limited to the Cassandra 3.x and 2.2.x API.
Also, I'm taking excerpts from the mailing list discussions here.
Neither the 4.x nor the 3.x Java drivers are in maintenance mode at the moment. It is very much true that any new Java driver features will be developed on the 4.x branch and in general will not be ported to 3.x. 3.x will continue to receive CVE and other critical bug fixes but as mentioned there are no plans for this branch to receive any new features. It's not completely impossible that a specific feature or two might make its way to 3.x on a case-by-case basis but if you're planning for the future with 3.x you should do so with the expectation that it will receive no new features.
&
Having said that, I would strongly recommend that you upgrade to version 3.11.3 of the Java driver (released on Sep 20, 2022), which is directly binary compatible with the version you're using today, 3.7.2 (released on Jul 10, 2019), to benefit from its features and fixes (including many CVE patches). In addition, I would suggest that you sketch out a plan to upgrade your apps to the 4.x driver, or look into modernizing them to interact with your Apache Cassandra®/DSE®/Astra DB® cluster via the Stargate® APIs.

Upgrading from GridGain to Apache Ignite

We're currently running GridGain 6.2.1. Is there an existing upgrade guide for transitioning to Apache Ignite?
There is no such guide, and it depends heavily on which parts of GridGain you're using. All functionality that existed in 6.x was migrated to Ignite with a slightly different API. So I suggest updating the version and then fixing the compilation errors step by step.

Which Apache Cassandra version to use for production

We are exploring Apache Cassandra and are going to use it in production soon.
We will most likely use the DataStax community edition of Apache Cassandra.
But after reading:
http://www.planetcassandra.org/blog/cassandra-2-2-3-0-and-beyond/
https://www.pythian.com/blog/cassandra-version-production/
and in particular this sentence from the second blog post, “If you don’t mind facing serious bugs and contribute to the development pick 3.x”,
I am confused about which version to choose for our production deployment.
I just need to know whether 3.5.0 and 3.0.6 are production ready.
Datastax community : 3.5.0 from http://www.planetcassandra.org/cassandra/
Datastax community : 3.0.6 from
http://www.planetcassandra.org/archived-versions-of-datastaxs-distribution-of-apache-cassandra/
or
Datastax community : 2.2.6 from
http://www.planetcassandra.org/archived-versions-of-datastaxs-distribution-of-apache-cassandra/
The version provided by DataStax is supposed to be stable and production ready. It comes with an application to monitor your cluster, which is nice if you don't have any ops people who know Cassandra, and you can pay to get support.
However, you don't get the latest version of Cassandra, so you can miss out on interesting features.
As for Cassandra 3.x, as said above, you get more features (for example JSON support) and better performance, but if you find a critical bug and can't fix it yourself, all you can do is file a ticket and hope it is taken care of quickly. Still, it is production ready and could work well for you.
In conclusion, go for the latest version only if you need a specific feature, or if you have developers on your team who can back that choice. Go for DataStax if you want something that works with less effort.

YCSB for Cassandra 3.0 Benchmarking

I have a Cassandra cluster on Ubuntu virtual machines and need to benchmark it.
I am trying to do it with Yahoo's YCSB (without using Maven, if possible).
I use Cassandra 3.0.1, but I can't find a suitable version of YCSB.
I don't want to switch to an older version of Cassandra (the latest YCSB Cassandra binding is for Cassandra 2.x).
What should I do?
As suggested here, although Cassandra 3.x is not officially supported, you can use the cassandra-cql binding.
For instance:
./bin/ycsb load cassandra-cql -threads 4 -P workloads/workloada
I just tested it on Cassandra 3.11.0 and it works for both load and run.
That said, which benchmark software to use depends on your test plan. If you want to benchmark only Cassandra, then @gsteiner's solution might be the best. If you want to benchmark different databases with the same tool to avoid variability, then YCSB is the right one.
I would recommend using cassandra-stress to run a load/performance test on your Cassandra cluster. It is very customizable, to the point that you can test distributions with different data models as well as specify how hard you want to push your cluster.
Here is a link to the DataStax documentation, which explains in depth how to use the tool.
https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsCStress_t.html

How to set up Spark cluster on Windows machines?

I am trying to set up a Spark cluster on Windows machines.
The way to go here is using the Standalone mode, right?
What are the concrete disadvantages of not using Mesos or YARN? And how much pain would it be to use either one of those? Does anyone have some experience here?
FYI, I got an answer in the user-group: https://groups.google.com/forum/#!topic/spark-users/SyBJhQXBqIs
The standalone mode is indeed the way to go. Mesos does not work under Windows, and YARN probably does not either.
Quick note: YARN should eventually work on Windows via the Hortonworks Data Platform (version 2.0 beta is on YARN, but it is Linux only at this time). Another potential route is to have it work against Hadoop 1.1 (Hortonworks Data Platform for Windows 1.1), but your approach of running it in Standalone mode is definitely the easiest way to get off the ground.

Resources