How do we execute a load test on a Cassandra DB using JMeter 5.1? Are there any plugins available?

I am trying to do load testing on a Cassandra DB. When I look for JMeter Cassandra plugins through the JMeter Plugins installation, there are 7 Cassandra samplers. I have a pretty good idea of the servers, keyspaces, and connection details.
There is limited help in this regard, and what does turn up in searches targets JMeter 2.9.

It would be better to use the cassandra-stress tool for this purpose. There are several resources available; as a starting point, you can look at DataStax Academy or The Last Pickle blog.

Just like any other database for which a Java client library exists:
1. Download the DataStax Java driver for Cassandra (with all its dependencies) and drop it into the JMeter classpath.
2. Restart JMeter so it picks the libraries up.
3. Add a JSR223 Sampler to your Test Plan.
4. Put the following code into the "Script" area:
def cluster = com.datastax.driver.core.Cluster.builder()
        .addContactPoint("cassandra-host") // IP address or hostname of a node in your Cassandra cluster
        .build()
def session = cluster.connect("your_keyspace")       // keyspace to run the queries against
def results = session.execute("SELECT * FROM users") // any CQL statement you want to load-test
session.close()
cluster.close()
More information: Cassandra Load Testing with Groovy

Related

Why is it so slow when using the Spark Cassandra Connector from Java code with a Cassandra cluster?

We have tested a lot of scenarios with small data sets.
If Cassandra is installed without a cluster, everything is OK, but if we use Cassandra in a cluster, the same function costs about 15 seconds more.
Our Java code is just like the sample code; it simply calls dataset.collectAsList() or dataset.head(10).
But with the same logic in Scala in spark-shell, we don't have the problem.
We have tested a lot of JDKs and operating systems. macOS is fine, but both Windows and Linux (e.g. CentOS) have this problem.
The collectAsList and head functions try to call getHostName, which is an expensive operation. So we can't use IP addresses to connect to the Cassandra cluster; we have to use hostnames to connect to it. And it works! The Spark Cassandra Connector code should fix this problem.
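In practice that means pointing spark.cassandra.connection.host at a hostname instead of an IP. Below is a minimal Java sketch under that assumption (Spark 2.x Dataset API with the Spark Cassandra Connector on the classpath; the hostname, keyspace, and table are placeholders):

import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class HostnameConnect {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("hostname-connect")
                // Hostname, not IP, so the connector's reverse-DNS
                // (getHostName) lookups resolve immediately.
                .set("spark.cassandra.connection.host", "cassandra-node-1");
        SparkSession spark = SparkSession.builder().config(conf).getOrCreate();
        List<Row> rows = spark.read()
                .format("org.apache.spark.sql.cassandra")
                .option("keyspace", "ks")     // placeholder keyspace
                .option("table", "users")     // placeholder table
                .load()
                .collectAsList();             // the call that was slow over IP
        System.out.println("Fetched " + rows.size() + " rows");
        spark.stop();
    }
}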

Getting "AssertionError("Unknown application type")" when Connecting to DSE 5.1.0 Spark

I am connecting to DSE (Spark) using this:
new SparkConf()
    .setAppName(name)
    .setMaster("spark://localhost:7077")
With DSE 5.0.8 (Spark 1.6.3) this works fine, but it now fails with DSE 5.1.0 with this error:
java.lang.AssertionError: Unknown application type
at org.apache.spark.deploy.master.DseSparkMaster.registerApplication(DseSparkMaster.scala:88) ~[dse-spark-5.1.0.jar:2.0.2.6]
After checking the use-spark jar, I've come up with this:
if(rpcendpointref instanceof DseAppProxy)
And within Spark, it seems to be an RpcEndpointRef (NettyRpcEndpointRef).
How can I fix this problem?
I had a similar issue, and fixed it by following this:
https://docs.datastax.com/en/dse/5.1/dse-dev/datastax_enterprise/spark/sparkRemoteCommands.html
Then you need to run your job using dse spark-submit, without specifying any master.
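For reference, the only code change this implies in the question's snippet is dropping setMaster and letting dse spark-submit supply the dse:// master (a sketch, not a DSE-specific API):

// Before (fails against the DSE 5.1 resource manager):
// new SparkConf().setAppName(name).setMaster("spark://localhost:7077")

// After: no master in code; `dse spark-submit` injects it.
SparkConf conf = new SparkConf().setAppName(name);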
Resource Manager Changes
The DSE Spark Resource Manager is different from the OSS Spark Standalone Resource Manager. The DSE method uses a different URI, "dse://", because under the hood it is actually performing a CQL-based request. This has a number of benefits over the Spark RPC but, as noted, does not match some of the submission mechanisms possible in OSS Spark.
There are several articles on this on the DataStax blog, as well as documentation notes:
- Network Security with DSE 5.1 Spark Resource Manager
- Process Security with DSE 5.1 Spark Resource Manager
- Instructions on the URL Change
Programmatic Spark Jobs
While it is still possible to launch an application using "setJars", you must also add the DSE-specific jars and config options to talk to the resource manager. In DSE 5.1.3+ there is a provided class,
DseConfiguration
which can be applied to your SparkConf via DseConfiguration.enableDseSupport(conf) (or invoked via an implicit) and will set these options for you; see the sketch after the links below.
Example
Docs
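As a rough sketch of the programmatic route (the enableDseSupport call is named above, but the package of DseConfiguration is not, so the import is left as an assumption to verify against your DSE 5.1.3+ jars):

import org.apache.spark.SparkConf;
// Import for DseConfiguration omitted: check your DSE jars for its package.

SparkConf conf = new SparkConf()
        .setAppName("my-app")                               // placeholder name
        .setJars(new String[] { "/path/to/assembly.jar" }); // your fat jar

// Adds the DSE-specific jars and config options needed to talk to the
// dse:// resource manager (Scala users can invoke this via an implicit).
DseConfiguration.enableDseSupport(conf);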
This is of course for advanced users only and we strongly recommend using dse spark-submit if at all possible.
I found a solution.
First of all, I think it is impossible to run a Spark job from within an application in DSE 5.1; it has to be sent with dse spark-submit.
Once sent, it works perfectly. For communication with the job I used Apache Kafka.
If you don't want to use a job, you can always go back to plain Apache Spark.

Cassandra database testing with JMeter

I have installed the Cassandra CQL shell on my local system, and I am using JMeter v3.0 to test the queries per second (QPS) on Cassandra. I have installed the "cassandra support" plugin available in the Plugins Manager in JMeter.
I have created a keyspace in Cassandra (keyspace1), created a table (student), and added some data in the CQL shell.
I have added "cassandra properties" from Config Elements and entered the properties in JMeter.
I have added a "cassandra get" sampler and a "view results tree" listener.
When I run it, I get the following error:
ERROR: java.lang.RuntimeException:
org.apache.thrift.transport.TTransportException: Read a negative frame
size (-2080374784)!
I have set the "schema properties" as shown on GitHub, but with no luck: I am getting the same error shown above.
Can anyone suggest how to resolve this error?
I want to use the Cassandra samplers for put, get, and delete operations on the database.
It looks like the Netflix plugin you are using is deprecated, because it relies on the Cassandra Thrift API, which is itself deprecated (the plugin has not had many recent commits on GitHub either).
See the announcements here and here.
Even if you succeed in your test with this plugin, it would not be very representative of current client use (and hence load).
IMHO you should build your test with JSR223 Groovy scripts (preprocessors and samplers) and use the standard DataStax Java driver + CQL in your script. I did this some time ago and it works fine; a sketch of the idea follows below.
(update: documented here)
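For illustration, here is a minimal Java sketch of that approach with the DataStax Java driver 3.x, using the keyspace and table from the question (the id and name columns are assumptions); the same calls can be pasted almost as-is into a JSR223 Groovy sampler for put, get, and delete:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class CqlSamplerSketch {
    public static void main(String[] args) {
        // Contact point is a placeholder; Cluster and Session are
        // Closeable in driver 3.x, so try-with-resources cleans up.
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("keyspace1")) {
            // put
            PreparedStatement insert =
                    session.prepare("INSERT INTO student (id, name) VALUES (?, ?)");
            session.execute(insert.bind(1, "alice")); // assumed columns: id int, name text
            // get
            Row row = session.execute("SELECT * FROM student WHERE id = 1").one();
            System.out.println(row);
            // delete
            session.execute("DELETE FROM student WHERE id = 1");
        }
    }
}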
Or maybe try this JMeter plugin from a DataStax engineer; it seems to use CQL. I haven't tried it, but it looks fine.
HTH,
Alain

DataStax Enterprise: Submitting a Spark 0.9.1 app to a DSE cluster the right way

I have a running analytics (Spark-enabled) DSE cluster of 8 nodes, and the Spark shell is working fine.
Now I would like to build a Spark app and deploy it on the cluster using the command "dse spark-class", which I guess is the right tool for the job according to the DSE documentation.
I built the app with sbt assembly and got the fat jar of my app.
Then, after a lot of digging, I figured out that I need to export the env var SPARK_CLIENT_CLASSPATH, because it is referenced by the spark-class command:
export SPARK_CLIENT_CLASSPATH=<fat jar full path>
Now I'm able to invoke:
dse spark-class <main Class>
The app crashes immediately with a ClassNotFoundException: it doesn't recognize internal classes of my app.
The only way I have been able to make it work is to initialize the SparkConf as follows:
val conf = new SparkConf(true)
.set("spark.cassandra.connection.host", "cassandrahost")
.set("spark.cassandra.auth.username", "cassandra")
.set("spark.cassandra.auth.password", "cassandra")
.setJars(Seq("fat-jar-full-path"))
val sc = new SparkContext("spark://masterurl:7077", "DataGenerator", conf)
The setJars method makes it possible to ship my jar to the cluster workers.
Is this the only way to accomplish that? I think it's pretty ugly and not portable.
Is it possible to have an external configuration to set the master URL, the Cassandra host, and the app jar path?
I have seen that, starting from Spark 1.0, there is the spark-submit command, which allows the app jar to be specified externally. Is it possible to update Spark to version 1.1 in DSE 4.5.3?
Thanks a lot
You can use spark-submit with DSE 4.6, which just dropped today (Dec 3rd, 2014) and includes Spark 1.1.
Here are the new features:
LDAP authentication
Enhanced audit logging:
- Audit logging configuration is decoupled from log4j
- Logging to a Cassandra table
- Configurable consistency levels for table logging
- Optional asynchronous logging for better performance when logging to a table
Spark enhancements:
- Spark 1.1 integration
- Spark Java API support
- Spark Python API (PySpark) support
- Spark SQL support
- Spark Streaming
- Kerberos support for connecting Spark components to Cassandra
DSE Search enhancements:
- Simplified, automatic resource generation
- New dsetool commands for creating, reloading, and managing Solr core resources
- Redesigned implementation of CQL Solr queries for production usage
- Solr performance objects
- Tuning index size and range query speed
- Restricted query routing for experts
- Ability to use virtual nodes (vnodes) in Solr nodes. Recommended range: 64 to 256 (overhead increases by approximately 30%)
Check out the docs here:
http://www.datastax.com/documentation/datastax_enterprise/4.6/datastax_enterprise/newFeatures.html
As usual you can download here with your credentials:
http://downloads.datastax.com/enterprise/opscenter.tar.gz
http://downloads.datastax.com/enterprise/dse-4.6-bin.tar.gz
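Once on DSE 4.6 / Spark 1.1, the hard-coded master URL and setJars call from the question can move out of the code entirely: dse spark-submit supplies the master, the application jar, and any spark.* properties (for example spark.cassandra.connection.host) from its command line or spark-defaults.conf. A minimal Java sketch of what the app is left with:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class DataGenerator {
    public static void main(String[] args) {
        // `new SparkConf(true)` loads spark.* system properties, which
        // spark-submit sets for you, so no master URL, Cassandra host,
        // or setJars() is hard-coded here.
        SparkConf conf = new SparkConf(true).setAppName("DataGenerator");
        JavaSparkContext sc = new JavaSparkContext(conf);
        // ... job logic ...
        sc.stop();
    }
}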

org.jboss.netty.channel.ChannelPipelineException: Failed to initialize a pipeline

I have an application that connects to Cassandra using the Java driver, fetches some configuration, and based on the results generates and executes some Pig scripts.
Now, I am able to successfully connect to Cassandra when the jars required for Pig are not on the classpath. Similarly, I am able to launch the PigServer class and execute scripts/statements using the entire DSE stack when I am not connecting to Cassandra through the Java driver to retrieve the configuration.
When I use both of them, I get the following exception:
org.jboss.netty.channel.ChannelPipelineException: Failed to initialize a pipeline.
at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:181)
at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:570)
... 35 more
Caused by: org.jboss.netty.channel.ChannelPipelineException: Failed to initialize a pipeline.
at org.jboss.netty.bootstrap.ClientBootstrap.connect(ClientBootstrap.java:208)
at org.jboss.netty.bootstrap.ClientBootstrap.connect(ClientBootstrap.java:182)
at com.datastax.driver.core.Connection.<init>(Connection.java:100)
at com.datastax.driver.core.Connection.<init>(Connection.java:51)
at com.datastax.driver.core.Connection$Factory.open(Connection.java:376)
at com.datastax.driver.core.ControlConnection.tryConnect(ControlConnection.java:207)
at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:170)
at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:87)
at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:576)
at com.datastax.driver.core.Cluster$Manager.access$100(Cluster.java:520)
at com.datastax.driver.core.Cluster.<init>(Cluster.java:67)
at com.datastax.driver.core.Cluster.buildFrom(Cluster.java:94)
at com.datastax.driver.core.Cluster$Builder.build(Cluster.java:501)
I see others have hit a similar exception, but when trying to execute Cassandra statements from MapReduce tasks, which is not my case:
https://groups.google.com/a/lists.datastax.com/forum/#!topic/java-driver-user/FhW_8e4FyAI
http://www.datastax.com/dev/blog/the-native-cql-java-driver-goes-ga#comment-297187
Thanks!
The DSE stack connects to Cassandra through the Thrift API, which is different from the Cassandra Java driver.
You can't use the Cassandra Java driver for Pig/Hadoop until CASSANDRA-6311 is resolved.
There may also be a bad security certificate / certificate expiration issue if you are using certificates.
