Cassandra: need to migrate cassandra to log4j

We are using embedded Cassandra in our Groovy test cases, and we are migrating from logback to log4j2. Whenever I run a Groovy test that uses Cassandra, it throws a NoClassDefFoundError for ch/qos/logback/classic/Logger. I have excluded the logback dependency from all existing Cassandra dependencies, but it still looks for logback. How can I make Cassandra log through log4j2?

Cassandra isn't set up or designed to run embedded, so while there may be hacks that get you by, it will be difficult to keep working across versions.
I would recommend using ccm for your tests so that Cassandra runs outside the JVM; it will also give you more control over interesting configurations. The Java driver has a useful bridge for Java applications in its tests here: CCMBridge.java
Long term, you might be able to use something like CASSANDRA-14821, as there will be native connections exposed, which will give you a lot more control over the results of queries and such.
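If the embedded setup has to stay for the moment, a small classpath check can at least show whether the exclusions took effect. The sketch below is an illustration, not part of Cassandra or any logging library: ClassLocator and locate are hypothetical names, and only JDK calls are used. If it reports that ch.qos.logback.classic.Logger is not on the classpath while the test still fails with NoClassDefFoundError, that would suggest some code on the test path references logback classes directly rather than going through SLF4J.

```java
import java.security.CodeSource;
import java.util.Optional;

/** Reports where a class would be loaded from, without initializing it. */
final class ClassLocator {

    /**
     * Returns the jar or directory that provides className, or an empty
     * Optional when the class cannot be loaded from the current classpath.
     */
    static Optional<String> locate(String className) {
        try {
            Class<?> type = Class.forName(className, false, ClassLocator.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap loader (the JDK itself) have no code source.
            if (source == null || source.getLocation() == null) {
                return Optional.of("(JDK runtime)");
            }
            return Optional.of(source.getLocation().toString());
        } catch (ClassNotFoundException | LinkageError e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // The first name comes from the error in the question; the others are
        // the SLF4J and log4j2 API entry points, listed here for comparison.
        String[] names = {
            "ch.qos.logback.classic.Logger",
            "org.slf4j.Logger",
            "org.apache.logging.log4j.Logger"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name).orElse("not on classpath"));
        }
    }
}
```

Running it with the same classpath as the failing test (for example from a test setup method) shows which jar, if any, still ships each class.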

Related

cassandra database testing with jmeter

I have installed the Cassandra CQL shell on my local system, and I am using JMeter v3.0 to test queries per second (QPS) against Cassandra. I have installed the "cassandra support" plugin available in JMeter's "plugin manager".
I have created a keyspace in Cassandra (keyspace1), created a table (student), and added some data in the CQL shell.
I have added "cassandra properties" from the config elements and entered the properties in JMeter.
Here are the properties:
I have added a "cassandra get" sampler
and a "view results tree" listener.
When I run it, I get the following error:
ERROR: java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Read a negative frame size (-2080374784)!
I have set the "schema properties" as shown on GitHub, but to no avail: I get the same error.
Can anyone suggest how to resolve this error?
I want to use the Cassandra samplers for put, get and delete operations on the database.

It looks like the Netflix plugin you are using is somewhat outdated, because it uses the Cassandra Thrift API, which is deprecated (the plugin has not had many recent commits on GitHub either).
See announcements here and here
Even if you succeed in your test with this plugin, it would not be very representative of current client usage (and hence load).
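The number in the error message is itself a hint. Framed Thrift reads the first four bytes of a reply as a big-endian signed 32-bit frame length, and -2080374784 is the bit pattern 0x84000000. In the CQL native protocol, 0x84 is the version byte of a v4 response (response bit 0x80 plus version 4). That is consistent with the Thrift-based plugin being pointed at the native transport port (9042 by default) instead of a Thrift port (9160 by default, when Thrift is enabled at all). This reading is an inference from the number, not something the question confirms, and the class and method names below are made up for the illustration.

```java
import java.nio.ByteBuffer;

/** Shows how a framed Thrift client would misread the start of a native-protocol response. */
final class NegativeFrameSize {

    /** Interprets four bytes as the big-endian signed 32-bit length prefix used by framed Thrift. */
    static int asFrameSize(byte[] firstFourBytes) {
        return ByteBuffer.wrap(firstFourBytes).getInt();
    }

    public static void main(String[] args) {
        // 0x84 = response bit (0x80) | protocol version 4.
        // The next three bytes (flags and stream id) are assumed to be zero.
        byte[] header = {(byte) 0x84, 0x00, 0x00, 0x00};
        System.out.println(asFrameSize(header)); // prints -2080374784
    }
}
```

If that is what is happening, checking the port configured in the plugin's "cassandra properties" against the ports the node actually listens on would be the first thing to try.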
IMHO you should build your test with JSR223 Groovy scripts (preprocessor and samplers) and use the standard DataStax Java driver + CQL in your script. I did this some time ago and it works fine.
(update: documented here)
Or maybe try this JMeter plugin from a DataStax guy; it seems to use CQL. I haven't tried it, but it looks fine.
HTH,
Alain

YCSB for Cassandra 3.0 Benchmarking

I have a Cassandra cluster on Ubuntu virtual machines and need to benchmark it.
I am trying to do it with Yahoo's YCSB (without using Maven, if possible).
I use Cassandra 3.0.1, but I can't find a suitable version of YCSB.
I don't want to switch to an older version of Cassandra (the latest YCSB Cassandra binding is for Cassandra 2.x).
What should I do?

As suggested here, although Cassandra 3.x is not officially supported, you can use the cassandra-cql binding.
For instance:
/bin/ycsb load cassandra-cql -threads 4 -P workloads/workloada
I just tested it on Cassandra 3.11.0 and it works for both load and run.
That said, the benchmark software to use depends on your test schedule. If you want to benchmark only Cassandra, then @gsteiner's solution might be the best. If you want to benchmark different databases with the same tool to avoid variability, then YCSB is the right one.

I would recommend using cassandra-stress to run a load/performance test on your Cassandra cluster. It is very customizable, to the point that you can test distributions with different data models as well as specify how hard you want to push your cluster.
Here is a link to the DataStax documentation, which covers how to use the tool in depth:
https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsCStress_t.html

For Cassandra kundera.client.lookup.class options

To configure Kundera for Cassandra, I notice there are three possible options for kundera.client.lookup.class:
com.impetus.client.cassandra.pelops.PelopsClientFactory
com.impetus.kundera.client.cassandra.dsdriver.DSClientFactory
com.impetus.client.cassandra.thrift.ThriftClientFactory
I am not sure of the pros and cons of these three, and hence not sure which one to use. Please help me decide.

I suggest you use com.impetus.client.cassandra.thrift.ThriftClientFactory. It is the implementation that uses just Cassandra's Thrift API.
PelopsClient is not in active development.
DSClient is built on the DataStax driver for Cassandra.
There is no real advantage to using DSClient over ThriftClient, or the other way around.

After further research, I found the following:
Don't use PelopsClient, as it's not in active development, as mentioned by @karthik, but more importantly because of the issue reported here.
The DataStax driver is better than the Thrift client, as it overcomes a few limitations of Thrift and uses a different binary protocol specific to Cassandra, which gives better performance. Refer to "Datastax java driver support for Cassandra using Kundera".
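One practical consequence of the choice is the port: the Thrift-based factories (Thrift and Pelops) talk to Cassandra's Thrift port, 9160 by default, while the DataStax-driver factory uses the native protocol port, 9042 by default. The sketch below builds a property map for a chosen factory. The helper class is hypothetical, and the kundera.nodes, kundera.port and kundera.keyspace keys are given from memory of Kundera's persistence.xml properties, so treat them as assumptions to check against the Kundera documentation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Builds Kundera connection properties for a chosen client factory. */
final class KunderaClientProperties {

    static final String THRIFT = "com.impetus.client.cassandra.thrift.ThriftClientFactory";
    static final String PELOPS = "com.impetus.client.cassandra.pelops.PelopsClientFactory";
    static final String DS_DRIVER = "com.impetus.kundera.client.cassandra.dsdriver.DSClientFactory";

    /** Cassandra's default ports: 9042 for the native protocol, 9160 for Thrift. */
    static int defaultPort(String clientFactory) {
        return DS_DRIVER.equals(clientFactory) ? 9042 : 9160;
    }

    /** Property keys other than kundera.client.lookup.class are assumed, see the text above. */
    static Map<String, String> forClient(String clientFactory, String host, String keyspace) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("kundera.client.lookup.class", clientFactory);
        props.put("kundera.nodes", host);
        props.put("kundera.port", String.valueOf(defaultPort(clientFactory)));
        props.put("kundera.keyspace", keyspace);
        return props;
    }

    public static void main(String[] args) {
        // "localhost" and "KunderaExamples" are placeholder values.
        forClient(DS_DRIVER, "localhost", "KunderaExamples")
                .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The same keys would normally appear as property entries in persistence.xml; a map like this is only convenient when switching factories between test runs.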

Differences between Hector Cassandra and JDBC

I'm currently starting a project that uses Apache Cassandra, so I'm interested in accessing my Cassandra database from Java. For that, I'm using Hector. However, I have some doubts about the differences between access via Hector and via Cassandra JDBC (specifically this: https://code.google.com/a/apache-extras.org/p/cassandra-jdbc/).
I believe the following (although I'm not sure if I'm right):
One difference could be that they are APIs of different levels (I consider Hector a higher-level API than Cassandra JDBC).
Cassandra JDBC uses CQL for accessing/modifying the database, while Hector doesn't use CQL (it only uses the methods provided for that).
I'd be thankful if someone could tell me whether I'm right or wrong in the previous lines, and point out more differences between the two (Hector and Cassandra JDBC).
Thanks in advance!

The official Cassandra Java Driver (https://github.com/datastax/java-driver) is probably the best (IMHO, the only) choice for a new project, for several reasons:
New features
All other Cassandra clients (Hector, Astyanax, etc.) are based on the legacy Thrift RPC protocol. The RPC "one response per request" model has severe limitations; for example, it doesn't allow processing several requests at the same time on a single connection, or streaming large ResultSets.
So DataStax developed a new protocol that doesn't have the RPC limitations. The Thrift API won't be getting new features; it's only kept for backward compatibility. In contrast, the Java Driver is actively developed to incorporate the new features of Cassandra 2.0, like conditional updates, batching prepared statements, etc. The overview of new features is here: http://www.datastax.com/dev/blog/cql-in-cassandra-2-0
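To make the "several requests on a single connection" point concrete: in the native protocol each frame carries a stream id, so a client can keep several requests in flight on one connection and match responses to requests even when they arrive out of order. The toy model below illustrates only that bookkeeping. It is not the Java Driver's API, no network I/O takes place, and MultiplexedConnection, send and onResponse are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of one connection carrying several in-flight requests, matched by stream id. */
final class MultiplexedConnection {

    private final AtomicInteger nextStreamId = new AtomicInteger();
    private final Map<Integer, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    /** Registers a request under a fresh stream id (starting at 0) and returns its future response. */
    CompletableFuture<String> send(String query) {
        // A real client would serialize the query into a frame tagged with this id.
        int streamId = nextStreamId.getAndIncrement();
        CompletableFuture<String> response = new CompletableFuture<>();
        pending.put(streamId, response);
        return response;
    }

    /** Called when a response frame arrives; responses may come back in any order. */
    void onResponse(int streamId, String body) {
        CompletableFuture<String> response = pending.remove(streamId);
        if (response != null) {
            response.complete(body);
        }
    }

    public static void main(String[] args) {
        MultiplexedConnection connection = new MultiplexedConnection();
        CompletableFuture<String> first = connection.send("SELECT * FROM users");   // stream id 0
        CompletableFuture<String> second = connection.send("SELECT * FROM orders"); // stream id 1
        // The second response arrives before the first one.
        connection.onResponse(1, "orders rows");
        connection.onResponse(0, "users rows");
        System.out.println(first.join() + " | " + second.join()); // prints users rows | orders rows
    }
}
```

With a strict one-response-per-request model, the second query could not be sent on the same connection until the first response had been read.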
Convenience
In the early Cassandra days (0.7), our company used an in-house low-level Thrift client. Later on we used Hector, Pelops and Astyanax in various projects. I can say that clients based on the Java Driver look the simplest and cleanest to me.
Performance
We have done some performance testing of the Cassandra Java Driver against other clients. In most scenarios the performance is roughly the same. However, there are certain situations where the Cassandra Java Driver significantly outperforms other clients due to its asynchronous nature.
Btw, there are a couple of related questions with excellent answers:
Advantages of using cql over thrift
Cassandra Client Java API's
EDIT: When I wrote this, I wasn't aware that Achilles (https://github.com/doanduyhai/Achilles), mentioned in another answer, has a CQL implementation that works via the Java Driver. For the sake of completeness, I must say that Achilles' DAO on top of CQL might be (or might one day become) a viable alternative to plain CQL via the Java Driver.

@mol
Why do you restrict yourself to Hector and cassandra-jdbc if you're starting a new project?
There are many other interesting choices:
Astyanax as Martin mentioned (Thrift & CQL3)
FireBrand (Thrift via Hector)
Achilles I've just developed (CQL3 & Cassandra 2.0 via Java driver core)
Java Driver Core for plain CQL3

Hector is indeed a higher-level API. Internally it uses Cassandra's Thrift API to execute its functions; it does not convert them to equivalent CQL calls. But its API also provides access to CQL. In that case it passes the CQL (via Thrift) to Cassandra's APIs for CQL.
CQL in Cassandra is a SQL-like language that works via the Cassandra APIs. So it does not provide any capability beyond what the APIs offer, but it does make Cassandra easier to use at times. If you are considering Hector, I would also look at Astyanax, which is a newer take on a high-level Java API to Cassandra.

Since you are starting a new project, it is best to start with CQL and the native Java driver:
http://www.datastax.com/documentation/developer/java-driver/1.0/webhelp/index.html#common/drivers/introduction/introArchOverview_c.html
Per DataStax, it is 10-15% faster than the Thrift APIs, as it uses the binary protocol.

Using Cassandra JDBC-compliant driver with DataNucleus JDO

I'm considering using a JDBC-compliant driver with JDO to connect to Cassandra. Is this possible, and is it going to cause huge overhead? I was looking at Astyanax, made by Netflix, and it looks good, but it is not as easy as JDO seems to be.

If using a JDBC driver, you need to write an RDBMS adapter class for Cassandra to communicate with it (see the DN docs).
Alternatively, use a Cassandra plugin for DataNucleus: https://github.com/pulasthi/Datanucleus-Cassandra-Plugin
Note that this plugin was not provided by the DataNucleus project (so some things in it may be non-optimal, due to the people concerned not necessarily understanding how a store plugin ought to be written) and only works with DataNucleus v2.x.
Update [Jan 2014]: there is now an official DataNucleus Cassandra plugin under development, already providing many things, and using CQL3.

The Cassandra CQL JDBC driver is located at http://code.google.com/a/apache-extras.org/p/cassandra-jdbc/
