Stubbed Cassandra for data storage

I need an embedded Cassandra for my project, and I was wondering whether I can use Stubbed Cassandra for data storage, because I need a system to simulate CQL requests and responses.
Thanks everyone.

You can't use it as a real datastore. If you need a real Cassandra datastore, use real Cassandra. Check out ccm, which is probably closer to what you're looking for.
There are wrappers for it in dtests (Python), and the Java driver uses it for testing and has a Java wrapper.

I don't have any experience with SCassandra, but I have worked on several projects using Apache Cassandra. Some use cases, such as experimenting with a multi-datacenter infrastructure, are ones I don't think SCassandra can handle. So if you plan to do simple tests, that's fine, but advanced use cases really need to be tested against a real Cassandra distribution.

As others have mentioned, you will need the real Cassandra for data storage. However, if you want to test CQL requests/responses then you can use this library:
Cassandra-Spy
It runs an actual embedded Cassandra and also can simulate failures for inserts/selects. This helps you test your app's behaviour in failure cases. I wrote the library to address this specific use case.
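To make the idea of simulating CQL requests, responses, and failures concrete, here is a minimal sketch of a primed stub. It is not the API of Stubbed Cassandra or Cassandra-Spy; `StubCqlSession`, `prime`, and `SimulatedFailure` are hypothetical names invented for this illustration, and the stub stores no data at all.

```python
import re


class SimulatedFailure(Exception):
    """Hypothetical error type raised when a primed failure matches a query."""


class StubCqlSession:
    """Hypothetical stand-in for a driver session. It only replays primed responses."""

    def __init__(self):
        self._primes = []   # (compiled pattern, rows, exception or None)
        self.executed = []  # every query received, useful for assertions in tests

    def prime(self, pattern, rows=None, fail_with=None):
        """Register a canned result (or a failure) for queries matching `pattern`."""
        self._primes.append((re.compile(pattern, re.IGNORECASE), rows or [], fail_with))

    def execute(self, query):
        self.executed.append(query)
        for pattern, rows, fail_with in self._primes:
            if pattern.search(query):
                if fail_with is not None:
                    raise fail_with
                return list(rows)
        return []  # unprimed queries return no rows


session = StubCqlSession()
session.prime(r"^SELECT .* FROM users", rows=[{"id": 1, "name": "alice"}])
session.prime(r"^INSERT INTO users", fail_with=SimulatedFailure("write timeout"))
```

Application code under test would receive `session` in place of a real driver session. Anything that depends on actual storage, consistency levels, or cluster behaviour still needs a real Cassandra, as the answers above point out.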

Related

How to connect to Flink SQL Client from NodeJS?

I'm trying to use Apache Flink's Table concept in one of my projects to combine data from multiple sources in real time. Unfortunately, all of my team members are Node.js developers, so I'm looking for possible ways to connect to Flink from Node.js and query it. Flink's documentation for the SQL Client mentions that
The SQL Client aims to provide an easy way of writing, debugging, and submitting table programs to a Flink cluster without a single line of Java or Scala code. The SQL Client CLI allows for retrieving and visualizing real-time results from the running distributed application on the command line.
Based on this, is there any way to connect to Flink's SQL Client from Node.js? Is there a driver already available for this, like the Node.js drivers for MySQL or MSSQL? Otherwise, what are the possible ways of achieving this?
Any idea or clarity on achieving this would be greatly helpful and much appreciated.
There's currently not much that you can do. The SQL Client runs on local machines and connects to the cluster there. I think what will help you is the introduction of the Flink SQL Gateway, which is expected to be released with Flink 1.16. You can read more about that on https://cwiki.apache.org/confluence/display/FLINK/FLIP-91%3A+Support+SQL+Gateway
Another alternative is to check out some of the products on the market that offer a Flink SQL editor; maybe that is a useful path for your colleagues.
For example:
https://www.ververica.com/apache-flink-sql-on-ververica-platform
https://docs.cloudera.com/csa/1.7.0/ssb-overview/topics/csa-ssb-intro.html
Note that this is not exactly what you asked for, but could be an option to enable your team.

Is there a simple Jmeter performance test case for Cassandra

We are creating JMeter performance benchmarks for our Cassandra installation.
For this we have been referring to the default Cassandra plugin mentioned on the site.
This plugin does not take any Cassandra server connection parameter for the "put", and there is not much help available on how to use it.
Can someone who knows how to configure the Cassandra connection help me with this plugin?
Hence we switched to an article on testing Cassandra with Groovy. (Link here)
That article calls for adding multiple JARs; some of them are bundles, and I cannot find the exact JARs:
snappy-java-1.0.5
netty-transport-4.0.33.Final
netty-handler-4.0.33.Final
netty-common-4.0.33.Final
netty-codec-4.0.33.Final
netty-buffer-4.0.33.Final
metrics-core-3.1.2
lz4-1.2.0
HdrHistogram-2.1.4
guava-16.0.1
Can someone help me with a simpler way to performance-test Cassandra?
For correct performance testing of Cassandra it's better to use specialized tools, like NoSQLBench that was developed specifically for that task. Generic tools won't give you the real performance numbers. Please read NoSQLBench documentation on how to correctly test Cassandra to take into account things like compaction, repairs, etc.
Have you tried reading the documentation? It mentions the CassandraProperties configuration element, where you can define your server connection parameters.
If you want full control rather than being limited to what others have implemented, consider following the instructions in the Cassandra Load Testing with Groovy article.

Best way to benchmark Cassandra and Hbase for performance?

What's the best way to benchmark Cassandra and Hbase for performance?
I'm working on an application whose usage, through a web application, is roughly 80% reads and 20% writes. Users can also perform CRUD (Create, Read, Update, Delete) operations on the data. Our data is all structured (it comes from an RDBMS). I have heard about YCSB (Yahoo! Cloud Serving Benchmark).
Has anyone benchmarked Cassandra vs. HBase for a similar use case?
I will assume that your Cassandra is sitting behind a web app?
If so (as you mentioned CRUD), just benchmark the endpoints of your CRUD operations, for writes (the Create) and for reads, with ApacheBench (ab) or Siege under load (i.e. concurrent calls, etc.).
Update
If you want to purely test if your configuration of Cassandra is correct for raw power:
http://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsCStress_t.html
but if you want to test the application as a whole, ApacheBench and Siege will test your app.
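As a rough illustration of what "benchmark the endpoints under concurrent load" means for an 80% read / 20% write mix, here is a minimal sketch. It is not a replacement for ab, Siege, cassandra-stress, or YCSB. The function name `run_mixed_load` and the two placeholder callables are assumptions made for this example; in practice they would wrap HTTP calls to your own read and create endpoints.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


def run_mixed_load(read_fn, write_fn, total_calls=200, concurrency=8,
                   read_ratio=0.8, seed=42):
    """Run a read/write mix concurrently and summarise latency per operation."""
    rng = random.Random(seed)  # fixed seed so the mix is reproducible
    ops = ["read" if rng.random() < read_ratio else "write"
           for _ in range(total_calls)]

    def timed(op):
        fn = read_fn if op == "read" else write_fn
        start = time.perf_counter()
        fn()
        return op, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed, ops))

    summary = {}
    for op in ("read", "write"):
        latencies = sorted(t for o, t in results if o == op)
        if latencies:
            p95_index = min(len(latencies) - 1, int(0.95 * len(latencies)))
            summary[op] = {
                "count": len(latencies),
                "mean_s": sum(latencies) / len(latencies),
                "p95_s": latencies[p95_index],
            }
    return summary


# Placeholder callables (assumptions): replace with real requests to your endpoints.
def fake_read():
    time.sleep(0.001)


def fake_write():
    time.sleep(0.002)


summary = run_mixed_load(fake_read, fake_write)
```

A single client process like this mostly measures the client, so treat the numbers as a smoke test of the whole application path rather than as database throughput figures.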
Most of the databases provide some tool to do performance testing. In my opinion, the best way to get an unbiased view is to use a third party tool like https://github.com/brianfrankcooper/YCSB which supports testing different types of ACID and NoSQL databases.

Is there an alternative to use Cassandra without PHP-driver?

I know about the Cassandra PHP driver being in BETA. But I strongly believe DataStax offers the best solutions as far as PHP drivers go for Cassandra.
What I would love to try is to see whether it's possible to get Cassandra data into my PHP application using CQLSH and PHP shell commands. (Both Cassandra and the PHP script run on the same server.)
Anyone ever tried this?
Would there be a method to get CQLSH return json or a different output instead of columns fit for my console?
Thanks for your insights.
cqlsh is built using the DataStax Python driver. That being said, I would not recommend invoking cqlsh through OS system calls from PHP. Not only is it impractical from a data format perspective, it is also hacky, I would not expect it to perform well, and it would add a lot of complexity and failure scenarios to your application.
For scalability, if you ever need to move your application to a different machine, you would not be able to. These are just a few of the downsides that I can think of from the top of my head.
You are better off using the beta PHP driver from DataStax or waiting for a stable version. RC1 is due to drop soon.

CouchDB in-memory implementation

Is there a mock backend for CouchDB, i.e. same REST interface and semantics but purely in-memory? We have a testsuite that runs each test on a pristine database every time (to be reproducible), but running against real database could be faster.
Do you mean running against a mock database?
I do not think there is something right out of the box. Two ideas:
* CouchDB on a memory filesystem. Set up a ramdisk, or tmpfs mount, and configure the CouchDB database_dir and view_index_dir to point to there.
* PouchDB is porting CouchDB to the browser IndexedDB standard. You did not say which language and environment you are using, but if you can run Node.js, this might be worth looking into. PouchDB has good momentum and I think it will be running in Node.js soon (perhaps through jsdom or some other library). Note that this does not get you the full solution, but it expands your question to "are there in-memory IndexedDB implementations for Node.js?", for which the answer is either "yes" or "soon," given its adoption trajectory.
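For the first idea (CouchDB on a memory filesystem), the configuration change amounts to pointing two settings in the `[couchdb]` section at the ramdisk. The sketch below only generates that fragment. The mount point `/mnt/couchdb-ram` and the helper name are assumptions for illustration; creating the tmpfs mount itself and deciding which CouchDB ini file receives the fragment are left to your setup.

```python
import configparser
import io


def couchdb_ramdisk_config(mount_point):
    """Render an ini fragment that puts database and view files on `mount_point`."""
    config = configparser.ConfigParser()
    config["couchdb"] = {
        "database_dir": mount_point,
        "view_index_dir": mount_point,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# "/mnt/couchdb-ram" is a hypothetical tmpfs mount point.
fragment = couchdb_ramdisk_config("/mnt/couchdb-ram")
print(fragment)
```

Because everything on a tmpfs mount disappears on unmount or reboot, this also gives the pristine database per test run that the question asks for.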
Found this: https://github.com/RipcordSoftware/AvanceDB - it supports different platforms and seems to be a serious effort.
Rather late to the party, but I've had great success using pouchdb-server, based on the aforementioned PouchDB project (a JavaScript implementation of CouchDB). It can run against a variety of back-ends, including an in-memory back-end. That means you can run
pouchdb-server --in-memory
to get an in-memory CouchDB-compatible server. There are several other command-line options to explore, too.
I think it is able to run the entire CouchDB test suite, so I'd guess it is fairly unlikely you'd run into too many implementation differences.
I have the same problem: for tests I don't want to set up a CouchDB instance; I just want something in memory, as simple as possible.
What I did:
* I created an in-memory CouchDB connector, which is just a very simple implementation of "org.ektorp.CouchDbConnector".
* With Spring I wire in whichever CouchDbConnector implementation I need: for my dev tests I wire in my in-memory connector, and to connect to a real CouchDB I use the usual connector, org.ektorp.impl.StdCouchDbConnector.
The only problem is that "org.ektorp.CouchDbConnector" has more than 50 methods that must be implemented. For my purposes it was enough to implement just a few of them; it depends on your test cases.
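The same pattern, sketched in Python rather than Java: the application depends on a small connector interface, and tests inject an in-memory implementation that covers only the calls they need. This is not Ektorp's API. All class and method names below are hypothetical, and the integer `_rev` counter is a simplification of real CouchDB revision strings.

```python
import uuid


class NotFoundError(Exception):
    """Raised when a document id is unknown."""


class ConflictError(Exception):
    """Raised on a duplicate id or a stale revision."""


class InMemoryConnector:
    """Hypothetical test double; implements only the calls the tests need."""

    def __init__(self):
        self._docs = {}

    def create(self, doc):
        doc_id = doc.get("_id") or uuid.uuid4().hex
        if doc_id in self._docs:
            raise ConflictError(doc_id)
        self._docs[doc_id] = dict(doc, _id=doc_id, _rev=1)
        return dict(self._docs[doc_id])

    def get(self, doc_id):
        if doc_id not in self._docs:
            raise NotFoundError(doc_id)
        return dict(self._docs[doc_id])  # copy so callers cannot mutate the store

    def update(self, doc):
        current = self.get(doc["_id"])
        if doc.get("_rev") != current["_rev"]:
            raise ConflictError(doc["_id"])  # stale revision, as CouchDB would reject
        self._docs[doc["_id"]] = dict(doc, _rev=current["_rev"] + 1)
        return dict(self._docs[doc["_id"]])


class UserRepository:
    """Application code depends only on the connector it is given."""

    def __init__(self, connector):
        self._connector = connector

    def rename(self, user_id, new_name):
        user = self._connector.get(user_id)
        user["name"] = new_name
        return self._connector.update(user)


connector = InMemoryConnector()
connector.create({"_id": "u1", "name": "alice"})
renamed = UserRepository(connector).rename("u1", "bob")
```

In production wiring, `UserRepository` would receive a connector backed by a real CouchDB instead. The trade-off is the one noted above: the double is only as faithful as the methods you bother to implement.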
memorydb is a partial (in-progress) in-memory implementation of CouchDB to be used with Kivik, which can be run as a stand-alone server.
Not all functionality is implemented yet.

Resources