Does Stargate REST API support Cassandra batch query? - cassandra

Does the Stargate REST API support batch queries in Cassandra (DSE 6.8)? If so, can you provide an example?

Batch queries are not supported at this time, but we do have the following issue open for adding the functionality:
https://github.com/stargate/stargate/issues/821
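Until that issue is resolved, one possible workaround is to send the rows one request at a time. The sketch below only builds the requests and does not send them. It assumes the Stargate REST v2 row endpoint (POST /v2/keyspaces/{keyspace}/{table}) and the X-Cassandra-Token header; the base URL, keyspace, table, and token are hypothetical placeholders. Unlike a CQL BATCH, separate requests are not atomic.

```python
import json
from urllib.request import Request


def build_row_requests(base_url, keyspace, table, rows, auth_token):
    """Build one POST request per row (assumed Stargate REST v2 endpoint).

    The requests are only constructed here, not sent. Each row is an
    independent write, so there is no batch atomicity.
    """
    url = f"{base_url}/v2/keyspaces/{keyspace}/{table}"
    requests = []
    for row in rows:
        requests.append(
            Request(
                url,
                data=json.dumps(row).encode("utf-8"),
                headers={
                    "Content-Type": "application/json",
                    "X-Cassandra-Token": auth_token,  # assumed auth header
                },
                method="POST",
            )
        )
    return requests
```

Each request could then be sent with urllib.request.urlopen(request), with the caller handling partial failures itself.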

Related

Presto connector for druid fails to identify tables with uppercase names

When I query Druid tables with uppercase names, the query fails with the error: Table 'TABLE_NAME' does not exist. A similar issue was observed in the MySQL connector, and the option "case-insensitive-name-matching" was added to the MySQL connector's catalog file to address it. I have tried the same option in the Druid catalog file, but it doesn't seem to work.
My guess is you're using Facebook's version of Presto.
TL;DR
You need to use the Trino Druid connector to get support for case-insensitive-name-matching. Trino was formerly known as Presto SQL.
Long version
case-insensitive-name-matching was first added in Presto SQL (I am the author of this code, BTW) and later backported to Facebook's Presto, but apparently it does not apply to their Druid connector. The Trino Druid connector (formerly Presto SQL's Druid connector) does not have this limitation. You can use either Presto 350 (before the project rename) or Trino 351 (after the rename).
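As a rough illustration, a Trino catalog file for Druid with this option enabled might look like the fragment below. The file name (etc/catalog/druid.properties) and the BROKER host and port are placeholders and assumptions, not values from the question.

```properties
# etc/catalog/druid.properties (hypothetical file name and broker address)
connector.name=druid
connection-url=jdbc:avatica:remote:url=http://BROKER:8082/druid/v2/sql/avatica/
# Match table names regardless of case
case-insensitive-name-matching=true
```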

Time Series Visualization from Cassandra

I have a Cassandra database and a Spark cluster that gets its inputs from Cassandra to do some processing.
In my Cassandra database, some of the tables hold time series. I am looking for a way to visualize those time series easily without adding more databases.
Grafana is a great tool for that, but unfortunately there seems to be no way to plug it into Cassandra.
So, for now I am using Zeppelin notebooks on my Cassandra/Spark cluster, but the available features for displaying time series aren't as good as Grafana's.
I also cannot replace Cassandra with InfluxDB, because my Cassandra is not used only for storing time series.
Unfortunately, there is no direct plugin that makes Cassandra a datasource for Grafana. Below are the different ways you can integrate Cassandra with Grafana.
There is a pull request for Cassandra as a datasource, https://github.com/grafana/grafana/pull/9774, but it has not been merged into the Grafana master branch. You could run a fork of Grafana with this PR and use the plugin.
You can use KairosDB on top of Cassandra (KairosDB can be configured to use Cassandra as its datastore, so there are no multiple databases :) and use the KairosDB plugin. This approach has some drawbacks:
You need to map the Cassandra schema to the KairosDB schema, which is metrics-based.
Although KairosDB uses Cassandra as its datastore, it stores the data in a different schema and table, so the data is duplicated.
If your app writes data to Cassandra directly, you need to write a simple client that pulls the latest data from Cassandra and pushes it to KairosDB.
You can implement a backend for the SimpleJSON plugin for Grafana (https://github.com/grafana/simple-json-datasource). There are lots of examples of SimpleJSON implementations available; write one for Cassandra and open-source it :)
You can push the data to Elasticsearch and use it as a datasource. ES is supported as a datasource by all major visualization tools.
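For the SimpleJSON route, a backend mostly has to answer POST /search with a list of metric names and POST /query with series of [value, timestamp in ms] pairs. Here is a minimal sketch using only the Python standard library. The fetch_points function is a hypothetical stand-in for a real Cassandra read, and the metric name, sample data, and port are invented for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def fetch_points(metric, start_iso, end_iso):
    """Hypothetical stand-in for a Cassandra query.

    Returns (epoch_ms, value) pairs; a real version would query the
    time-series table for the given metric and time range.
    """
    sample = {"cpu_load": [(1500000000000, 0.42), (1500000060000, 0.57)]}
    return sample.get(metric, [])


def build_query_response(body, fetch=fetch_points):
    """Turn a SimpleJSON /query request body into the response payload."""
    time_range = body.get("range", {})
    series = []
    for target in body.get("targets", []):
        name = target["target"]
        rows = fetch(name, time_range.get("from"), time_range.get("to"))
        # SimpleJSON expects each datapoint as [value, timestamp_ms]
        series.append(
            {"target": name, "datapoints": [[value, ts] for ts, value in rows]}
        )
    return series


class SimpleJsonHandler(BaseHTTPRequestHandler):
    def _send_json(self, payload):
        data = json.dumps(payload).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self):
        # Grafana calls "/" to test the datasource connection
        self._send_json({"status": "ok"})

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        if self.path == "/search":
            self._send_json(["cpu_load"])  # hypothetical metric list
        elif self.path == "/query":
            self._send_json(build_query_response(body))
        else:
            self._send_json([])


def serve(port=8000):
    """Start the sketch server (the port is an arbitrary choice)."""
    HTTPServer(("0.0.0.0", port), SimpleJsonHandler).serve_forever()
```

Calling serve() and pointing a SimpleJSON datasource at http://localhost:8000 would be the next step; replacing fetch_points with a real driver query is left open here.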
It's a bit late, but there is now a direct integration: a Cassandra datasource for Grafana.
https://github.com/HadesArchitect/GrafanaCassandraDatasource
I would suggest using Banana for visualization, but this requires Solr to be enabled on the time-series table. Banana is a fork of Kibana and also has powerful dashboard configuration capabilities.
https://github.com/lucidworks/banana

Spring Cassandra vs. Astyanax performance

I am trying to evaluate the performance of Astyanax and Spring Cassandra, so I wrote a program to measure insertion and read time. It turned out that with large data sets Astyanax showed an insertion rate up to 600 times faster than Spring Cassandra. I believe Spring Cassandra uses the DataStax driver to communicate with Cassandra, whereas Astyanax uses Thrift. Can anyone who knows the Cassandra client APIs well give me more information on their performance? Does anything in my analysis look wrong?
Astyanax and the Thrift protocol are deprecated in Cassandra. Netflix, which contributed Astyanax, has ceased all new development in favor of the DataStax Java driver.
SDC* uses the DataStax Java driver, which uses the latest protocol and is very fast in the production environments I have deployed into.
Without your test, it is impossible to tell you why you are seeing what you are seeing.
Are you testing reads or writes?
Are you using the spring-data-cassandra or spring-cql module?
Are you explicitly setting the ConsistencyLevel in your SDC* tests?
Which methods of the template or repository are you using for your test?
We can perform 10K writes per second PER NODE in a C* cluster using the DS java driver.

Is it possible to execute SOLR 4.0 spatial queries from CQL against DSE Search 3.2.0 instance?

Is it possible to execute SOLR 4.0 spatial queries from CQL against DSE Search 3.2.0 instance? If yes, what is the correct syntax? In particular my question is about CQL queries referring to a field of a type implemented using solr.SpatialRecursivePrefixTreeFieldType class. Running SOLR queries referring to this type against DSE Search 3.2.0 instance using SOLR Web console works just fine.
Thanks,
Leon
DSE Search CQL only supports basic Lucene syntax, and at this stage it is only provided for development/testing purposes, so you're encouraged to use standard Solr APIs.
Just to be clear about usage, normally a "spatial" query is a filter query that is applied to a main query to limit the area of the results while the main query does selection of data by non-spatial attributes such as keywords, but the CQL syntax has only a main query and no provision for any additional Solr query parameters such as filter queries ("fq") or the Solr parameters used by spatial queries.
So, the Solr HTTP API is the only route for spatial queries against DSE data.
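As a sketch of what such a request could look like over HTTP, the helper below builds a select URL with a keyword main query and a geofilt filter query. The host, the core name (keyspace.table), and the field name are hypothetical; the only assumption about the schema is that the field uses solr.SpatialRecursivePrefixTreeFieldType as described in the question.

```python
from urllib.parse import urlencode


def spatial_select_url(base_url, core, main_query, field, lat, lon,
                       radius_km, rows=10):
    """Build a Solr select URL: main query plus a spatial filter query."""
    params = {
        "q": main_query,  # non-spatial selection, e.g. keywords
        # geofilt limits results to a circle of radius_km around the point
        "fq": f"{{!geofilt sfield={field} pt={lat},{lon} d={radius_km}}}",
        "rows": rows,
        "wt": "json",
    }
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"
```

For example, spatial_select_url("http://localhost:8983", "ks.places", "name:cafe", "geo", 45.15, -93.85, 5) yields a URL that can be fetched with any HTTP client.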

Differences between Hector Cassandra and JDBC

I'm currently starting a project that uses Apache Cassandra, so I'm interested in accessing my Cassandra database from Java. For that, I'm using Hector. However, I have some doubts about the differences between access via Hector and via Cassandra JDBC (specifically this: https://code.google.com/a/apache-extras.org/p/cassandra-jdbc/).
I believe the following (although I'm not sure whether I'm right):
One difference could be that they are APIs of different levels (I consider Hector a higher-level API than Cassandra JDBC).
Cassandra JDBC uses CQL to access and modify the database, while Hector doesn't use CQL (it only uses the methods it provides for that).
I'd be thankful if someone could tell me whether I'm right or wrong in the lines above, and point out more differences between the two (Hector and Cassandra JDBC).
Thanks in advance!
Official Cassandra Java Driver (https://github.com/datastax/java-driver) is probably the best (IMHO, the only) choice for a new project for several reasons:
New features
All other Cassandra clients (Hector, Astyanax, etc.) are based on the legacy Thrift RPC protocol. The RPC "one response per one request" model has severe limitations; for example, it doesn't allow processing several requests at the same time on a single connection or streaming large ResultSets.
So, DataStax developed a new protocol that doesn't have RPC limitations. Thrift API won't be getting new features, it's only kept for backward-compatibility. In contrast, Java Driver is actively developed to incorporate the new features of Cassandra 2.0, like conditional updates, batching prepared statements, etc. The overview of new features is here: http://www.datastax.com/dev/blog/cql-in-cassandra-2-0
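To illustrate why lifting the "one response per request" restriction matters, here is a toy asyncio model of a single connection carrying several in-flight requests, each tagged with a stream id so responses can come back in any order. It is only a simulation under invented names (MultiplexedConnection, the fake server delays); it is not the driver's actual implementation.

```python
import asyncio


class MultiplexedConnection:
    """Toy model: one connection, many concurrent requests via stream ids."""

    def __init__(self):
        self._next_stream_id = 0
        self._pending = {}   # stream id -> Future awaiting that response
        self._tasks = set()  # keep references to simulated server tasks

    async def request(self, payload, server_delay):
        stream_id = self._next_stream_id
        self._next_stream_id += 1
        future = asyncio.get_running_loop().create_future()
        self._pending[stream_id] = future
        # Simulated server: replies after a delay, tagged with the stream id
        task = asyncio.create_task(
            self._fake_server(stream_id, payload, server_delay)
        )
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        return await future

    async def _fake_server(self, stream_id, payload, delay):
        await asyncio.sleep(delay)
        self._on_response(stream_id, f"result for {payload}")

    def _on_response(self, stream_id, body):
        # Match the response to its request using the stream id
        self._pending.pop(stream_id).set_result(body)


async def run_demo():
    conn = MultiplexedConnection()
    completion_order = []

    async def call(name, delay):
        result = await conn.request(name, delay)
        completion_order.append(name)
        return result

    # Both requests are in flight on the same connection at once
    results = await asyncio.gather(call("slow", 0.05), call("fast", 0.01))
    return results, completion_order
```

Running asyncio.run(run_demo()) shows the "fast" request completing before the "slow" one even though it was sent second, which a strict request/response connection could not do.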
Convenience
In the early Cassandra days (0.7), our company used an in-house low-level Thrift client. Later on we used Hector, Pelops and Astyanax in various projects. I can say that clients based on the Java Driver look the simplest and cleanest to me.
Performance
We have done some performance testing of the Cassandra Java Driver against other clients. In most scenarios the performance is roughly the same. However, there are certain situations in which the Cassandra Java Driver significantly outperforms other clients due to its asynchronous nature.
Btw, there's a couple of related questions with excellent answers:
Advantages of using cql over thrift
Cassandra Client Java API's
EDIT: When I wrote this, I wasn't aware that Achilles (https://github.com/doanduyhai/Achilles), mentioned in another answer, has a CQL implementation that works via the Java Driver. For the sake of completeness I must say that Achilles' DAO on top of CQL might be (or might one day become) a viable alternative to plain CQL via the Java Driver.
@mol Why do you restrict yourself to Hector and cassandra-jdbc if you're starting a new project?
There are many other interesting choices:
Astyanax as Martin mentioned (Thrift & CQL3)
FireBrand (Thrift via Hector)
Achilles I've just developed (CQL3 & Cassandra 2.0 via Java driver core)
Java Driver Core for plain CQL3
Hector is indeed a higher-level API. Internally it will use Cassandra's Thrift API to execute its functions. It will not convert them to equivalent CQL calls. But its API also provides access to CQL. In this case it will pass the CQL (via Thrift) to Cassandra's APIs for CQL.
CQL in Cassandra is a SQL-like language that works via the Cassandra APIs. So it does not provide any additional capability in the use of Cassandra than the APIs but does make it easier at times to use. If you are considering using Hector I would also look at Astyanax which is a newer take on a high-level Java API to Cassandra.
Since you are starting a new project, it is best to start with CQL via the native Java driver:
http://www.datastax.com/documentation/developer/java-driver/1.0/webhelp/index.html#common/drivers/introduction/introArchOverview_c.html
Per DataStax, it is 10-15% faster than Thrift APIs, as it uses Binary Protocol.
