Spark Cassandra Connector query by partition key - apache-spark

What is the ideal way to query Cassandra by partition key using the Spark Connector? I am using where to pass in the key, but that causes Cassandra to add ALLOW FILTERING under the hood, which in turn causes timeouts.
Current setup:
csc.cassandraTable[DATA]("schema", "table").where("id = ?", "xyz").map(x => print(x))
Here id is the partition key (not the full primary key).
I have a composite primary key and am using only the partition key in the query.
Update:
Yes, I am getting an exception with this:
Cassandra failure during read query at consistency LOCAL_ONE (1 responses were required but only 0 replica responded, 1 failed)
None of my partitions has more than 1,000 records, and I am running a single Cassandra node.

ALLOW FILTERING is not going to affect your query if you use a where clause on the entire partition key. If the query is timing out, it may mean that your partition is just very large or that the full partition key was not specified.
EDIT:
Cassandra failure during read query at consistency LOCAL_ONE (1 responses were required but only 0 replica responded, 1 failed)
This means that your queries are being sent to machines which do not have a replica of the data you are looking for. Usually this means that the replication of the keyspace is not set correctly or that the connection host is incorrect. The LOCAL part of LOCAL_ONE means that the query is only allowed to succeed if the data is available in the local data center (LOCAL_DC).
With this in mind, you have 3 options:
Change the initial connection target of your queries
Change the replication of your keyspace
Change the consistency level of your queries
Since you only have 1 machine, changing the replication of your keyspace is probably the right thing to do.

Related

Cassandra timeout during read query at consistency ALL

We use Cassandra 3.11.1, com.datastax.oss java-driver-core 4.13.0 and Java 13.
We have multiple read microservices that read data from Cassandra, but we got this error:
Cassandra timeout during read query at consistency ALL (8 responses were required but only 7 replica responded)
Our queries are mostly simple selects by primary key, and some of them even set the consistency level to LOCAL_QUORUM, so we wonder why we still ran into this issue. Sample queries:
// Query 1: select by email address, with an explicit LOCAL_QUORUM consistency level
QueryBuilder.selectFrom(Email.TABLE_NAME)
    .all()
    .whereColumn(Email.COLUMN_EMAIL_ADDRESS)
    .isEqualTo(bindMarker())
    .limit(bindMarker())
    .build()
    .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
// Query 2: select by account id; no consistency level is set on the statement
QueryBuilder.selectFrom(Account.TABLE_NAME)
    .columns(
        AccountDataObject.COLUMN_ACCOUNT_ID,
        AccountDataObject.COLUMN_EMAIL_ADDRESS)
    .whereColumn(Account.COLUMN_ACCOUNT_ID)
    .isEqualTo(bindMarker())
    .build();

Cassandra timeout during read query at consistency LOCAL_QUORUM (2 responses were required but only 0 replica responded)

Cassandra timeout during read query at consistency LOCAL_QUORUM (2 responses were required but only 0 replica responded); nested exception is com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency LOCAL_QUORUM (2 responses were required but only
I am getting the above error while executing a query on one of the Cassandra tables of my application. The table has 3 columns: promo, store and upc. promo is type = PrimaryKeyType.PARTITIONED, and both store and upc are type = PrimaryKeyType.CLUSTERED. I am getting this for only one of the promo values. How can I resolve this?
The exception means that the nodes in your cluster were unresponsive. You need to investigate why. Start by reviewing the system.log and debug.log on the nodes.
In a lot of cases, this is caused by nodes being overloaded and GC pauses. Cheers!

Cassandra result from running query cqlsh from one of the node cluster

I am new to Cassandra and haven't finished reading the docs yet.
I just want to know what happens if I run cqlsh from one node of a 3-node cluster and run a query,
e.g.
cqlsh node1host -u username -p passwd -k my_cass_keyspace
> select ...
Does the result come from all 3 nodes, or just from the node that I run cqlsh on?
Sorry for the very noob question.
Thanks.
General answer: from all nodes.
Detailed answer:
The node you connect to acts as the coordinator. Depending on your replication factor, it may have to fetch data from other nodes (for example, with RF 1, when you query a partition that is stored on another node).
This depends on your replication factor and also on your consistency level.
You can check your consistency level in cqlsh (the default is ONE):
cqlsh> consistency;
Current consistency level is ONE.
You can change it (to QUORUM, for example):
cqlsh> CONSISTENCY QUORUM ;
If you want to see details of how your request is executed, turn on tracing:
cqlsh> tracing on ;
I hope this helps !
The answer is yes, it comes from all the nodes, but it depends on your cluster configuration, replication factor and consistency level.
For example: you have a 3-node cluster, the replication factor is 3, and the consistency level is QUORUM for both reads and writes. Whenever you run an insert, your data is replicated to all 3 nodes, but acknowledgement from 2 nodes is sufficient for the coordinator to satisfy QUORUM. The same applies to a select query.
You may also refer to the Cassandra documentation:
http://cassandra.apache.org/doc/latest/architecture/dynamo.html#replication
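The arithmetic behind the example above (RF 3, QUORUM satisfied by 2 acknowledgements) can be sketched in a few lines. This is only an illustration in Python, not code from any Cassandra driver: the function name required_responses is hypothetical, it assumes a single data center, and it uses the usual quorum formula of half the replication factor (rounded down) plus one.

```python
def required_responses(consistency: str, replication_factor: int) -> int:
    """Number of replica responses the coordinator waits for.

    Illustrative only: assumes a single data center and covers
    just the three levels discussed in this thread.
    """
    level = consistency.upper()
    if level == "ONE":
        return 1
    if level == "QUORUM":
        # Quorum is a strict majority of the replicas.
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {consistency}")


# With RF 3 (the example above), print what each level requires.
for level in ("ONE", "QUORUM", "ALL"):
    print(level, required_responses(level, 3))
```

The data is still written to every replica regardless of the level chosen; the consistency level only controls how many acknowledgements the coordinator waits for before answering the client.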

Cassandra clustering fail over-High Availability

I have configured a Cassandra cluster with 3 nodes:
Node1 (192.168.0.2), Node2 (192.168.0.3), Node3 (192.168.0.4)
I created a keyspace 'test' with a replication factor of 2:
CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};
When I stop either Node2 or Node3 (one at a time, or both at the same time), I am able to do the CRUD operations on the keyspace.table.
When I stop Node1 and try to update/create a row from Node2 or Node3, I get the following error, although Node2 and Node3 are up and running:
All host(s) tried for query failed (tried: /192.168.0.4:9042
(com.datastax.driver.core.exceptions.DriverException: Timeout while
trying to acquire available connection (you may want to increase the
driver number of per-host connections)))
com.datastax.driver.core.exceptions.NoHostAvailableException: All
host(s) tried for query failed (tried: /192.168.0.4:9042
(com.datastax.driver.core.exceptions.DriverException: Timeout while
trying to acquire available connection (you may want to increase the
driver number of per-host connections)))
I am not sure how Cassandra elects a leader if a leader node dies.
You are using replication_factor 2, so only 2 nodes will hold a replica of any given piece of data in your keyspace (not all 3 nodes).
My first advice is to change the RF to 3.
You also have to pay attention to the consistency level you are using. If you have only 2 copies of your data (RF 2) and you are using consistency level QUORUM, it will try to write the data to half of the replicas + 1, which in this case is both replicas. So if 1 of those nodes is down, you will not be able to write or read that data.
To verify where the data is replicated, you can look at the ring of your cluster. As you are using SimpleStrategy, replicas are placed clockwise around the ring. In your case the data is copied to the nodes at 192.168.0.2 and 192.168.0.3.
Take a look at the concepts of replication factor: http://docs.datastax.com/en/archived/cassandra/2.0/cassandra/architecture/architectureDataDistributeReplication_c.html
And Consistency Level: http://docs.datastax.com/en/archived/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
Great answer about RF vs CL: https://stackoverflow.com/a/24590299/6826860
You can use this calculator to find out whether your setup has decent consistency. In your case the result is: "You can survive the loss of no nodes without impacting the application".
I think I wasn't clear in my response. The replication factor is how many copies of your data will exist. The consistency level is how many copies the client waits for before it gets a response from the server.
Example: all your nodes are up. The client issues a CQL write with CL QUORUM; the server copies the data to 2 nodes (3/2 + 1, using integer division) and replies to the client. In the background it copies the data to the third node as well.
In your example, if you shut down 2 nodes of a 3-node cluster, you will never achieve a QUORUM (with CL QUORUM), so you have to use consistency level ONE. Once the nodes are up again, Cassandra will copy the data to them. One thing that can happen: before Cassandra copies the data to the other 2 nodes, the client makes a request to node1 or node2 and the data is not there yet.
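To make the "survive the loss of no nodes" result concrete, here is a minimal Python sketch of the calculation, assuming a single data center. The helper names quorum and survivable_replica_losses are made up for this illustration and are not part of any driver or tool.

```python
def quorum(replication_factor: int) -> int:
    """Majority of replicas: half the replication factor (rounded down) plus one."""
    return replication_factor // 2 + 1


def survivable_replica_losses(replication_factor: int, required: int) -> int:
    """How many replicas of a row can be down while `required` can still answer."""
    return max(replication_factor - required, 0)


# Compare the asker's setup (RF 2) with the suggested one (RF 3).
for rf in (2, 3):
    at_quorum = survivable_replica_losses(rf, quorum(rf))
    at_one = survivable_replica_losses(rf, 1)
    print(f"RF={rf}: QUORUM needs {quorum(rf)} replicas and tolerates "
          f"{at_quorum} down; ONE tolerates {at_one} down")
```

With RF 2, QUORUM requires both replicas, so losing either replica node blocks the request; raising RF to 3 or lowering the consistency level to ONE leaves room for one replica to be down.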

Cassandra read timeouts on AWS

TL;DR: I'm using Cassandra. I'm running tests to see if it will handle the load, but I get lots of timeouts when reading the data.
com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency LOCAL_ONE (1 responses were required but only 0 replica responded)
I have set up a Cassandra cluster on Amazon AWS: 8 m4.xlarge instances with 2 EBS drives - type 'gp2' - of 100 GB each (commit log on one drive, the rest of the data on the other one). The instances are in the same availability zone, in a VPC. I'm using the stock version of Apache Cassandra 3.7 with no specific tuning of the servers or of Cassandra itself.
I have loaded 1 billion records. Each of them has about 30 fields. The primary key is made of 2 partition key columns and one clustering column. I have about 10 records per partition key. Replication factor is 3. Each of the 8 nodes stores about 40 GB of data after compaction.
My test consists of making 1000 queries on random keys with a basic Scala application using the Datastax Cassandra driver. The WHERE clause contains the partition key and I read all the records, i.e. the WHERE clause does not include the clustering column.
When the queries are sequential, all the queries return expected results and the average response time is 74 ms.
When I use async queries, make all the queries at once and call get() on the Futures, I get many timeouts after 5 seconds (between 25% and 75% of the queries fail).
I assumed the EBS drives might be throttled and I tried with a different cluster: 3 nodes of type i2.xlarge with data stored on the ephemeral drives.
Note that, during my tests, compaction had already finished. I did not see the garbage collector kicking in during the queries.
Any idea why the queries are generating timeouts?
When I use async queries, make all the queries at once and call get() on the Futures, I get many timeouts after 5 seconds (between 25% and 75% of the queries fail).
Did you throttle your async queries? How many selects did you send to the cluster asynchronously?
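Throttling here means capping how many requests are in flight at once instead of firing all 1000 at the same moment. The following is a rough Python sketch of that idea using an asyncio semaphore. It does not talk to Cassandra: fake_query is a hypothetical stand-in for a real asynchronous driver call, and the limit of 64 is an arbitrary value chosen for the illustration, not a recommended setting.

```python
import asyncio


async def fake_query(key: int) -> int:
    # Hypothetical stand-in for an asynchronous driver call.
    await asyncio.sleep(0.001)
    return key


async def run_throttled(keys, max_in_flight: int):
    """Run one query per key, never exceeding max_in_flight concurrent requests."""
    semaphore = asyncio.Semaphore(max_in_flight)
    in_flight = 0
    peak = 0  # highest concurrency actually observed

    async def one(key):
        nonlocal in_flight, peak
        async with semaphore:  # waits here when the limit is reached
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await fake_query(key)
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(one(k) for k in keys))
    return results, peak


results, peak = asyncio.run(run_throttled(range(1000), max_in_flight=64))
print(len(results), "queries completed; peak concurrency", peak)
```

The same pattern applies in a Scala or Java client: acquire a permit before issuing each async query and release it when the future completes, so the cluster never sees more than a fixed number of outstanding reads from one client.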

Resources