How do I build Cassandra from GitHub source?

I found this repo on GitHub: https://github.com/apache/cassandra
I would like to import it into IntelliJ and build it so that I can run code locally that I want to build on top of it. But there are no instructions for building it.
Where are the instructions?

Thank you for taking interest in writing Cassandra code.
The instructions for building Cassandra from source, including IDE integration, are documented on the Contributing to Cassandra page of the official Apache Cassandra website. There are instructions for IntelliJ, NetBeans and Eclipse.
It's not as straightforward as we would like because everyone's laptop/desktop is different, so I recommend joining the ASF Slack to get help in real time from other Cassandra contributors in the #cassandra-dev channel. For details, see the Community section of the Cassandra website. Cheers!
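For orientation, the usual command-line steps can be scripted. The sketch below is a minimal example, not the project's official procedure. It assumes git, a JDK, and Apache Ant are already installed, that the default ant target builds the project, and that the generate-idea-files target (which writes IntelliJ project files) exists in the branch you check out. The checkout directory name is a placeholder. Nothing is executed unless you pass dry_run=False.

```python
import subprocess
from pathlib import Path

REPO_URL = "https://github.com/apache/cassandra"  # repository from the question


def build_steps(checkout_dir="cassandra"):
    """Return (argv, working_directory) pairs for a source build.

    Assumes the default Ant target builds the project and that the
    generate-idea-files target exists in the checked-out branch.
    """
    checkout = Path(checkout_dir)
    return [
        (["git", "clone", REPO_URL, str(checkout)], Path(".")),
        (["ant"], checkout),                         # build with the default target
        (["ant", "generate-idea-files"], checkout),  # write IntelliJ project files
    ]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv, cwd in steps:
        print(f"(in {cwd}) $ {' '.join(argv)}")
        if not dry_run:
            subprocess.run(argv, cwd=cwd, check=True)


run_steps(build_steps())  # dry run: only prints the three commands
```

After the last step, opening the checkout directory in IntelliJ should pick up the generated project files. If a step fails, the Contributing page and the #cassandra-dev channel mentioned above are the places to check.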

Related

Airflow 2.0 Docker setup

I have recently been trying to learn Airflow, but the majority of resources online depend on this repo, https://github.com/puckel/docker-airflow, which unfortunately has been removed.
I am not familiar with Docker, so I'm just trying to set up locally and play around with Airflow. I'm on a Windows setup and have already gotten Docker working on my computer. Does Airflow have a quick-setup docker-compose file? Or are there any other resources I can look at? Thanks.
It's a duplicate question.
Use the official docker-compose.yml; see here.
I recently added a quick-start guide to the official Apache Airflow documentation. Unfortunately, this guide has not been released yet. It will be released in Airflow 2.0.1.
For now, you can use the development version, and when a stable version is released it will be very easy for you to migrate. I don't expect any major changes to our docker-compose.yaml files.
http://apache-airflow-docs.s3-website.eu-central-1.amazonaws.com/docs/apache-airflow/latest/start/docker.html

Sentiment analysis using spark and Stanford NLP API

When I wanted to do a sentiment analysis project I searched a lot online, and at last I landed on this website. It explains the code, but it does not explain how to use Spark with that code; I mean, where to add the code.
Website: http://stdatalabs.blogspot.in/2017/09/twitter-sentiment-analysis-using-spark.html?m=1
It would be a great help if anyone could explain this to me completely, as I am a beginner and this is my first project on big data.
Thank you.
At the bottom of that page there is a link to the GitHub repository (https://github.com/stdatalabs/sparkNLP-elasticsearch). You should check it out (literally).
According to the pom.xml, the main class is com.stdatalabs.SparkES.TwitterSentimentAnalysis.
So running mvn package will yield an executable .jar (use java -jar to run it).
Running the jar will prompt you for some Twitter configuration (keys, etc.) and save the results to a local Elasticsearch cluster using a hardcoded index (and mapping), twitter_020717/tweet.
You can now alter the code any way you want, build, run, and check the results.
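To check the results after a run, you could query the index the job writes to. This is a rough sketch, not part of the linked project. It assumes Elasticsearch is listening on localhost:9200 (its default HTTP port) and that the index and type are the twitter_020717/tweet pair mentioned above. Putting the type in the path works only on the older Elasticsearch versions that still support mapping types. The function names are made up for illustration.

```python
import json
import urllib.request

ES_HOST = "http://localhost:9200"  # assumption: local cluster on the default port
INDEX = "twitter_020717"           # hardcoded index named in the answer
DOC_TYPE = "tweet"                 # hardcoded mapping type named in the answer


def search_url(host=ES_HOST, index=INDEX, doc_type=DOC_TYPE, size=5):
    """Build the URL of a _search request that returns up to `size` documents."""
    return f"{host}/{index}/{doc_type}/_search?size={size}"


def fetch_documents(url):
    """Run the search and return the stored documents (the _source of each hit)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        body = json.load(response)
    return [hit["_source"] for hit in body["hits"]["hits"]]


if __name__ == "__main__":
    try:
        for doc in fetch_documents(search_url()):
            print(doc)
    except OSError as err:  # connection and HTTP errors from urllib derive from OSError
        print(f"Could not reach Elasticsearch: {err}")
```

If the request comes back with no hits, the job most likely has not indexed anything yet, or it is writing to a different cluster than the one queried here.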

Why not publish the voltdb.jar to maven repo?

I only found the VoltDB client in the Maven repo, but not the voltdb jar that contains VoltProcedure.
That makes it hard for me to manage the dependencies with Maven, Gradle, or other tools.
Is there any deep reason for that, VoltDB guys?
I work at VoltDB. We have a feature request ticket to add the voltdb-.jar to the maven repo, so there is no deep reason it is not there yet, only limited time and resources.
You may want to review our recently updated instructions for setting up Eclipse to run JUnit tests of stored procedures, or to run procedures with the debugger. They were recently moved to the examples/HOWTOs folder provided with the kit, and are available on GitHub here.
Are you working with anyone in our organization to evaluate VoltDB for a project? We have Solution Architects who can assist you with technical issues if you'd like to contact us at info (at) voltdb (dot) com.
Best regards,
Ben

External Authentication for Cassandra in DSE 4.7

We are trying to implement external authentication for Cassandra on DSE 4.7. We followed a few guides that say to extend the IAuthenticator class, but there is little documentation on how to integrate the result.
Is it plug and play, where we extend the IAuthenticator class, build a jar, place it in lib (/usr/share/dse/resources/cassandra/lib), and change the yaml file accordingly? Or do we take the source code from GitHub, build the entire tree, and then use that?
If so, is DataStax's Cassandra available on GitHub?
What do we need to do to build external authentication other than LDAP and Kerberos in DSE 4.7?
"extend IAuthenticator class build a jar and place it in lib(/usr/share/dse/resources/cassandra/lib) and change the yaml file accordingly"
^^ Yes, this is the right approach.
"Datastax's Cassandra available on Github?"
Not exactly. The release notes list the version of C* that ships with DSE, and you can check the matching source in the apache/cassandra GitHub repo (it matches up to, but excluding, the build number). The exact C* build under DSE includes some critical patches from future versions, and that exact source code is not available. However, the dot release in apache/cassandra is good enough for all intents and purposes.
For example, look at https://github.com/apache/cassandra/tree/cassandra-2.1.8 for DSE 4.7.1.
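The lookup described above is easy to script. Here is a small sketch, with the caveat that the only DSE-to-Cassandra pairing taken from this thread is 4.7.1 to 2.1.8; any other entry would have to come from the DSE release notes. The helper name is hypothetical, and the URL pattern simply follows the example link above.

```python
# Pairings of DSE release -> bundled Apache Cassandra version.
# Only the 4.7.1 entry comes from this thread; add others from the DSE release notes.
DSE_TO_CASSANDRA = {"4.7.1": "2.1.8"}


def cassandra_source_url(dse_version):
    """Return the apache/cassandra GitHub URL for the tag matching a DSE release."""
    try:
        cassandra_version = DSE_TO_CASSANDRA[dse_version]
    except KeyError:
        raise ValueError(f"No known Cassandra version for DSE {dse_version}") from None
    return f"https://github.com/apache/cassandra/tree/cassandra-{cassandra_version}"


print(cassandra_source_url("4.7.1"))
```

Keep in mind that the source at that tag will not include the extra patches DataStax applies to its own build.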
As mentioned by @Mikea, we need to override ISaslAwareAuthenticator, and when using Cassandra in DSE we need to be very sure of the Cassandra version and then dig into the appropriate GitHub repo.

How to Use Apache Drill with Cassandra

I am trying to query Cassandra using Apache Drill. The only connector I could find is here:
http://www.confusedcoders.com/bigdata/apache-drill/sql-on-cassandra-querying-cassandra-via-apache-drill
However this does not build. It comes up with an artifact not found error. I also had another developer who is more versed in these tools take a stab at it, but he also had no luck.
I tried contacting the developer of the plugin I referenced, but the blog does not work and won't let me post comments. Has anyone got this plugin to work (and if so, how?), or is there another plugin or method I can use to connect Apache Drill to Cassandra? If anyone could show me how to connect and execute a simple SQL query, that would be much appreciated.
I looked at the latest Cassandra storage plugin patch and the latest Apache Drill source. The Drill code has changed and the patch can no longer be applied.
I then manually took the patch apart (it is mostly diff output). Most of the patch was new classes, which I could easily add to the latest Drill source tree. Most of the other updates were easy to insert into the current source. Two specific classes required some minor code modifications/extensions. I rebuilt the distribution from the modified source and installed the Drill servers on a 3-node cluster. The Cassandra schema failed to initialize properly, throwing a null pointer exception in one of the new classes. This leads me to believe that the (latest) modified storage plugin is incompatible with the latest version of Cassandra. Since the author of the original storage plugin is unreachable and no one else is stepping up to support the code, this is a dead horse. Beat it if you must.
I am the author of the patch, which I wrote a year ago. I could not get it merged into Drill then, and later got occupied with other things :(
With so many changes to Drill internals, I am not sure how much welding would be needed at this point to get it working. Please use the code just as a reference for writing a Drill storage plugin.
I have added this as a banner at the top of the blog post to save fellow developers hours.
I don't know if anyone is still interested in this topic but I've been experimenting with this plugin and got it to work with Drill 1.18-SNAPSHOT. Here is a link to my branch with this code: 1. My plan is to submit this as a PR for Drill, but it still needs some work. This code will successfully query Cassandra 3.11.5 (latest stable version).
