Datastax Cassandra java driver cluster configurations - cassandra

I have a cluster with 3 servers and I am trying to create Datastax Cassandra Cluster with the following configuration. Should I leave it to Datastax default values or what are the recommended values?
failover.retryPolicy = ?
failover.reconnectionPolicyDelayMs = ?
pooling.coreConnectionsPerHost = ?
pooling.maxConnectionPerHost= ?
socket.keepAlive=?
socket.reuseAddress=?
socket.tcpNoDelay=?
socket.receiveBufferSize=?
socket.sendBufferSize=?
Thanks in advance

Both Cassandra and the DataStax drivers are shipping with pretty sane default configurations. I'd say you should start developing your application and turn to tuning the configuration only at the time you notice that your specific scenario and deployment setup require changes.

Related

Spark JobServer can use Cassandra as SharedDb

I have been doing a research about Configuring Spark JobServer Backend (SharedDb) with Cassandra.
And I saw in the SJS documentation that they cited Cassandra as one of the Shared DBs that can be used.
Here is the documentation part:
Spark Jobserver offers a variety of options for backend storage such as:
H2/PostreSQL or other SQL Databases
Cassandra
Combination of SQL DB or Zookeeper with HDFS
But I didn't find any configuration example for this.
Would anyone have an example? Or can help me to configure it?
Edited:
I want to use Cassandra to store metadata and jobs from Spark JobServer. So, I can hit any servers through a proxy behind of these servers.
Cassandra was supported in the previous versions of Jobserver. You just needed to have Cassandra running, add correct settings to your configuration file for Jobserver: https://github.com/spark-jobserver/spark-jobserver/blob/0.8.0/job-server/src/main/resources/application.conf#L60 and specify spark.jobserver.io.JobCassandraDAO as DAO.
But Cassandra DAO was recently deprecated and removed from the project, because it was not really used and maintained by the community.

Cassandra production Monitoring

I am new to Cassandra and trying to setup monitoring to Cassandra production cluster.
Apart from monitoring using nodetool commands in crontab what else is recommended?
is it a general practice to use ganglia for monitoring?
can you direct me to a good resource on setting up monitoring in production.
we are using apache cassandra so opscenter was not very useful.
The free version of OpsCenter works with OSS Cassandra and most monitoring capabilities are available. You do miss a good amount of cluster management capabilities if you don't have DSE:
http://www.datastax.com/what-we-offer/products-services/datastax-opscenter/compare

How to setup opendaylight contoller cluster over cassandra

I have a Cassandra cluster running and now I want to set up a cluster of opendaylight controller over it.
The wiki page just mentions that I need to point the opendaylight to the cassandra cluster but I am unable to figure out how.
Please provide some details about it.
There is an updated wiki page about how to setup clustering at https://wiki.opendaylight.org/view/Running_and_testing_an_OpenDaylight_Cluster.
You will need to use the karaf distribution and the helium release.
http://www.opendaylight.org/software

Performing Analytics over Cassandra DB

I am working for a small concern and very new to apache cassandra. Studying about cassandra and performing some small analytics like sum function on cassandra DB for creating reports. For the same, Hive and Accunu can be choices.
Datastax Enterprise provides the solution for Apache Cassandra and Hive Integration. Is Datastax Enterprise is the only solution for such integration. Is there any way to resolve the hive and cassandra integration. If so, Can I get the links or documents regarding the same. Is that possible to work the same with the windows platform.
Is any other solution to perform analytics on cassandra DB?
Thanks in advance .
I was trying to download DataStax Enterprise (DSE) for Windows but found there is no such option on their website. I suppose they do not support DSE for Windows.
Apache Cassandra does have builtin Hadoop support. You need to set up a standalone Hadoop cluster colocated with Apache Cassandra nodes and then use ColumnFamilyInputFormat and ColumnFamilyOutputFormat to read/write data from/to your Hadoop cluster.

Mixing Datastax Enterprise with Cassandra community

I'm experimenting with Datastax Enterprise and I'm trying to have a cluster that mixes Enterprise nodes and standard Cassandra community nodes. I would only need a few nodes with advanced features like Solr and it would be nice to have all the nodes in the same cluster.
I tried to bootstrap a community node to a test Enterprise cluster, and it couldn't join the ring properly, throwing exceptions like that:
Unable to find compaction strategy class
'com.datastax.bdp.hadoop.cfs.compaction.CFSCompactionStrategy'
I assume that the Enterprise node tries to replicate CFs that have features from DSE, which are not recognized by the community node.
Is there a way to prevent that from happening? Am I trying to do something that's not possible/supported/allowed by DSE?
That is an unsupported configuration. The full cluster needs to be installed with DataStax enterprise binaries on all nodes. You can choose which nodes run as vanilla Cassandra, Hadoop or Solr by startup options on each node. DSE has a custom compaction strategy and snitch so that error is expected.

Resources