Not enough replicas available for query at consistency LOCAL_ONE (1 required but only 0 alive) - apache-spark

I am running spark-cassandra-connector and hitting a weird issue:
I run the spark-shell as:
bin/spark-shell --packages datastax:spark-cassandra-connector:2.0.0-M2-s_2.11
Then I run the following commands:
import com.datastax.spark.connector._
val rdd = sc.cassandraTable("test_spark", "test")
println(rdd.first)
# CassandraRow{id: 2, name: john, age: 29}
The problem is that while rdd.take(1) works, rdd.take(2) gives an error:
rdd.take(1).foreach(println)
# CassandraRow{id: 2, name: john, age: 29}
rdd.take(2).foreach(println)
# Caused by: com.datastax.driver.core.exceptions.UnavailableException: Not enough replicas available for query at consistency LOCAL_ONE (1 required but only 0 alive)
# at com.datastax.driver.core.exceptions.UnavailableException.copy(UnavailableException.java:128)
# at com.datastax.driver.core.Responses$Error.asException(Responses.java:114)
# at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onSet(RequestHandler.java:467)
# at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1012)
# at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:935)
# at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
And the following command just hangs:
println(rdd.count)
My Cassandra keyspace seems to have the right replication factor:
describe test_spark;
CREATE KEYSPACE test_spark WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
How do I fix both of the above errors?

I assume you are hitting the known issue with SimpleStrategy in a multi-DC cluster when using LOCAL_ONE consistency (the Spark connector's default). The coordinator looks for a replica in the local DC to send the request to, but there's a chance that all the replicas live in a different DC, so the requirement can't be met. (CASSANDRA-12053)
If you change the consistency level (spark.cassandra.input.consistency.level) to ONE, I think it will be resolved. You should also seriously consider using NetworkTopologyStrategy instead.
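As a minimal sketch against the question's test_spark.test table, the setting can either be passed when launching the shell:
bin/spark-shell --packages datastax:spark-cassandra-connector:2.0.0-M2-s_2.11 --conf spark.cassandra.input.consistency.level=ONE
or overridden per-RDD inside the shell:
import com.datastax.spark.connector._
import com.datastax.spark.connector.rdd.ReadConf
import com.datastax.driver.core.ConsistencyLevel
// Override the connector's default read consistency (LOCAL_ONE) for this RDD only
val rdd = sc.cassandraTable("test_spark", "test")
  .withReadConf(ReadConf(consistencyLevel = ConsistencyLevel.ONE))
rdd.take(2).foreach(println)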

Related

Removed DC2 from replication but still able to query data from DC2 nodes

I am trying to remove DC2 from my Cassandra cluster. To do this, I started by altering the replication factor from 2 to 0 for DC2. When I insert a row on DC1 node1, I can still read that row when querying from DC2 nodes.
Why is this happening?
I'm assuming that you're querying the data with cqlsh. By default, it uses a consistency level of ONE, so it will query any replica; in your case, they all happen to be in DC1.
If you query with a local consistency level (e.g. LOCAL_ONE) instead, you will probably get the result (or lack of one) that you're expecting.
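For example, a hypothetical cqlsh session on a DC2 node (the keyspace and table names are placeholders) shows the difference:
CONSISTENCY ONE;
SELECT * FROM some_ks.some_table WHERE id = 1;   -- succeeds: the coordinator forwards to a DC1 replica
CONSISTENCY LOCAL_ONE;
SELECT * FROM some_ks.some_table WHERE id = 1;   -- fails: no live replicas exist in the local DC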
As a side note, although setting replication to 0 is technically valid, it is more customary to simply remove a DC completely from replication so you end up with:
ALTER KEYSPACE some_ks WITH REPLICATION = {
  'class' : 'NetworkTopologyStrategy',
  'DC1' : 3
};

Why does data become inaccessible on Cassandra when it is scaled out with replication factor 1

I was experimenting with a scale-out operation in Cassandra, and noticed that the data becomes inaccessible (disappears from the client's point of view) when you scale horizontally with the replication factor (RF) set to 1.
Before scale out:
2 nodes cluster (1 seed + 1 non-seed)
replication = {'class': 'SimpleStrategy', 'replication_factor': '1'}
After scale out:
4 nodes cluster (3 seeds + 1 non-seed)
replication = {'class': 'SimpleStrategy', 'replication_factor': '1'}
In both cases, I tried writing/reading data using cqlsh with CONSISTENCY ONE.
I also checked the backup data: the written data still exists after the scale-out, i.e. it is not data loss. Setting RF=3 somehow made the data available again.
I understand that this is not an ideal setup anyway, but why can't cqlsh read an existing replica?
Version info in case it matters:
[cqlsh 5.0.1 | Cassandra 3.11.6 | CQL spec 3.4.4 | Native protocol v4]

Spark Cassandra Issue with KeySpace Replication

I have created a table in Cassandra with the commands below:
CREATE KEYSPACE test WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'dc1' : 3 } AND DURABLE_WRITES = false;
use test;
create table demo(id int primary key, name text);
Once the table was created successfully, I ran the code below to write data into Cassandra from Spark, but I am facing the error that follows.
Spark code snippet:
import com.datastax.spark.connector._
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import com.datastax.spark.connector.cql._
// Case class mapped to the test.demo table
case class Demo(id: Int, name: String)
val connectorToClusterOne = CassandraConnector(sc.getConf
  .set("spark.cassandra.connection.host", "xx.xx.xx.xx")
  .set("spark.cassandra.auth.username", "xxxxxxx")
  .set("spark.cassandra.auth.password", "xxxxxxx"))
implicit val c = connectorToClusterOne
// Parse each "id,name" line of the input file into a Demo row and save it
val data = sc.textFile("/home/ubuntu/test.txt")
  .map(_.split(","))
  .map(p => Demo(p(0).toInt, p(1)))
data.saveToCassandra("test", "demo")
Below is the error description:
Error while computing token map for keyspace test with datacenter dc1: could not achieve replication factor 3 (found 0 replicas only), check your keyspace replication settings.
Could anyone suggest the possible reason for this?
This error usually means that either the request is not being directed at the correct cluster, or the datacenter does not exist or has an incorrect name.
To make sure you are connecting to the correct cluster, double-check the connection host used for your Spark application.
To check the datacenter, use nodetool status to make sure that the datacenter you requested exists and that its name includes no extraneous whitespace.
Lastly, it is possible that all the nodes in the datacenter are down, so double-check this as well.
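As an illustration (the exact name depends on your cluster), datacenter names in NetworkTopologyStrategy are case-sensitive, so if nodetool status reported the datacenter as DC1 rather than dc1, the keyspace from the question would have to be altered to match:
ALTER KEYSPACE test WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'DC1' : 3 };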

Cassandra: Insert fails for consistency level "Quorum"

We get the error "cannot achieve consistency level QUORUM" (details below) in the following configuration:
Two datacenters with 6 nodes each, all nodes on the same rack.
It works when the CL is set to "Local Quorum".
Basically, as soon as we use a consistency level that requires cross-DC consistency, inserts fail. The nodetool status command shows that all 12 nodes are up and running.
What can be wrong?
Your help is much appreciated!
Thanks
Dimitry
Keyspace
CREATE KEYSPACE test6 WITH replication = {'class': 'NetworkTopologyStrategy', 'CentralUS': '3', 'EastUs': '3'} AND durable_writes = true;
Query
INSERT INTO glsitems (itemid,itemkey) VALUES('1', 'LL');
Error
cassandra-driver-2.7.2\cassandra\cluster.py", line 3347, in result
    raise self._final_exception
Unavailable: code=1000 [Unavailable exception] message="Cannot achieve consistency level QUORUM" info={'required_replicas': 4, 'alive_replicas': 3, 'consistency': 'QUORUM'}
It could be that Cassandra thinks all the nodes are in the same datacenter. In that case LOCAL_QUORUM would always work properly, but QUORUM would not: with RF 3 in each DC, QUORUM requires (3 + 3) / 2 + 1 = 4 replicas, and here only 3 are considered alive.
Did you configure the snitch correctly?
Snitch – For multi-data center deployments, it is important to make sure the snitch has complete and accurate information about the network, either by automatic detection (RackInferringSnitch) or details specified in a properties file (PropertyFileSnitch).
You can find which snitch is used via the endpoint_snitch property in the cassandra.yaml file.
See the DataStax documentation on the snitches available for Cassandra 2.0.
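As a sketch of what to check (the IP addresses and rack names below are placeholders), the snitch is set on every node in cassandra.yaml, and with PropertyFileSnitch the topology file must map each node to the exact DC names used by the keyspace:
# cassandra.yaml
endpoint_snitch: PropertyFileSnitch
# conf/cassandra-topology.properties (format: node_ip=DC:rack)
10.0.1.1=CentralUS:RAC1
10.0.2.1=EastUs:RAC1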

Not enough replica available for query at consistency LOCAL_ONE (1 required but only 0 alive)

I have 6 nodes in my Cassandra cluster and all of the nodes are up. My keyspace is set up as:
replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true
Doing a read from this Cassandra cluster gives me the error:
java.io.IOException: Exception during execution of SELECT "colA", "colB", "colC", "colD" FROM "keyspacename"."tablename" WHERE token("colA") > ? AND token("colA") <= ? LIMIT 1 ALLOW FILTERING: Not enough replica available for query at consistency LOCAL_ONE (1 required but only 0 alive)
All my nodes are up and the replication factor is 1, so what is causing this problem?
Also, I can connect with cqlsh and run SELECT and INSERT statements on this table.
Does anyone know what's going on?
SimpleStrategy is for use with a single data center only; if you ever intend to have more than one data center, use NetworkTopologyStrategy instead.
Alternatively, try running the same query with a non-local consistency level (ONE, ALL, ANY, or QUORUM), i.e. not with LOCAL_*.
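For example, a hedged sketch of that keyspace change (the datacenter name must match what nodetool status reports; datacenter1 is assumed here):
ALTER KEYSPACE keyspacename WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 1 };
After changing replication, run nodetool repair on each node so the existing data is redistributed to match the new replica placement.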
