Error during inserting data: NoHostAvailable: - cassandra

I'm trying to learn the basics of Apache Cassandra. I found this simple example application at https://docs.datastax.com/en/cql/3.1/cql/ddl/ddl_music_service_c.html
So I created a keyspace, then a table, and now I am trying to add some data to the database.
But when I try to insert data, I get an error: "NoHostAvailable:" That's it. No more information.
So far I've tried updating the Python driver (NoHostAvailable exception connecting to Cassandra from python), but it didn't work.
What am I doing wrong? Or is it a problem with cqlsh?

OK, I've found the answer. NetworkTopologyStrategy is not suited to running on a single node. After changing the replication strategy to SimpleStrategy, everything started to work.

I just hit the same problem. Check the keyspace's replication settings: if it uses NetworkTopologyStrategy, make sure the data center name is correct.

To change the replication strategy from NetworkTopologyStrategy to SimpleStrategy (which is appropriate for a single node), run the following query:
ALTER KEYSPACE yourkeyspaceName WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};
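To confirm the change took effect (assuming Cassandra 3.x, where keyspace metadata lives in the system_schema keyspace), you can query the keyspaces table; note that unquoted keyspace names are stored lowercased:

```sql
-- Check the current replication settings for your keyspace
SELECT keyspace_name, replication
FROM system_schema.keyspaces
WHERE keyspace_name = 'yourkeyspacename';
```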

For me, it happened because one of the instances went down. I restarted the second instance and the error was gone. But in the schema table I see the topology as Simple for my keyspace, which is confusing.

Let's clear the air here...
You ABSOLUTELY can use NetworkTopologyStrategy in a single-node configuration. I currently have five versions of Cassandra installed on my local machine, all configured that way, and they work just fine.
It is not quite as simple as using SimpleStrategy, though, so there are some steps that need to be taken:
Start by setting the GossipingPropertyFileSnitch in the cassandra.yaml:
endpoint_snitch: GossipingPropertyFileSnitch
That tells Cassandra to use the cassandra-rackdc.properties file to name logical data centers and racks:
$ cat conf/cassandra-rackdc.properties | grep -v "#"
dc=dc1
rack=rack1
If you have a new cluster, you can change those. If you have an existing cluster, leaving them is the best idea. But you'll want to reference the dc name, because you'll need that in your keyspace definition.
Now, if you define your keyspace like this:
CREATE KEYSPACE stackoverflow WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': '1'};
With this configuration, NetworkTopologyStrategy can be used just fine.
Opinions will differ on this, but I do not recommend using SimpleStrategy. It's a good idea to get used to NetworkTopologyStrategy on your local machine. I say this because I have seen the opposite happen: folks accidentally deploy a SimpleStrategy keyspace into a multi-datacenter high availability (MDHA) cluster in production, and then wonder why their application's configured consistency cannot be met.

This sometimes happens when the node that owns the partition ranges for that insert is down. Check all of your nodes with the commands below, then try the query again.
nodetool status
nodetool describecluster

Related

system_auth replicates without changing replication_factor, so why change it?

I have a PasswordAuthenticator login, user = cassandra, with system_auth RF = 1 and 2 non-seed nodes. When I changed the password, it propagated to the non-seed nodes even though RF = 1. So why change the RF? (The reason I ask: I noticed that if I change it to 3 before the other nodes are up, I can't log in due to a QUORUM error, so this is a bootstrapping question.)
cassandra#cqlsh:system_auth> SELECT * FROM system_schema.keyspaces;
keyspace_name | durable_writes | replication
--------------------+----------------+---------------------------------------------------------------------------------------
system_auth | True | {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '1'}
Two problems can happen if you don't change it:
In a multi-node cluster, if the node responsible for your user/role data crashes, you won't be able to log in.
In a multi-DC cluster, if you attempt to connect with your application using a "local" data center, you will only be able to connect to nodes in the DC containing your user/role data. This is because SimpleStrategy is not DC aware. Not changing it will also only store one replica in the cluster, which will only be in a single DC. Login attempts on the other DCs will fail.
In short, I recommend:
Never use SimpleStrategy. It's not useful for production MDHA (multi-datacenter high availability), so why build a lower environment with it?
Set the RF on system_auth to number of nodes, but no more than 3.
Create an additional superuser, change the password of the cassandra/cassandra user, and never use it again. The cassandra/cassandra user triggers some special behavior inside cqlsh to operate at QUORUM by default. If nodes crash, you don't want the extra problems that opens the door to.
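The recommendations above can be sketched in CQL. This assumes Cassandra 2.2+ (role-based auth), a single data center named dc1, and placeholder role names and passwords:

```sql
-- Raise system_auth replication (example: 3+ node, single-DC cluster)
ALTER KEYSPACE system_auth
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': '3'};

-- Create a new superuser, then retire the default one
CREATE ROLE dbadmin WITH SUPERUSER = true AND LOGIN = true
  AND PASSWORD = 'choose-a-strong-password';
ALTER ROLE cassandra WITH PASSWORD = 'something-long-and-random';
```

After raising the RF, run nodetool repair on the system_auth keyspace on each node so the role data actually gets replicated.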
Here's a related answer, which should help: Replication Factor to use for system_auth

Alter Keyspace on cassandra 3.11 production cluster to switch to NetworkTopologyStrategy

I have a cassandra 3.11 production cluster with 15 nodes. Each node has ~500GB total with replication factor 3. Unfortunately the cluster is setup with Replication 'SimpleStrategy'. I am switching it to 'NetworkTopologyStrategy'. I am looking to understand the caveats of doing so on a production cluster. What should I expect?
Switching from SimpleStrategy to NetworkTopologyStrategy in a single data center configuration is very simple. The only caveat I would warn about is to make sure you spell the data center name correctly; failure to do so will cause operations to fail.
One way to ensure that you use the right data center, is to query it from system.local.
cassdba#cqlsh> SELECT data_center FROM system.local;
data_center
-------------
west_dc
(1 rows)
Then adjust your keyspace to replicate to that DC:
ALTER KEYSPACE stackoverflow WITH replication = {'class': 'NetworkTopologyStrategy',
'west_dc': '3'};
Now for multiple data centers, you'll want to make sure that you specify your new data center names correctly, AND that you run a repair (on all nodes) when you're done. This is because SimpleStrategy treats all nodes as a single data center, regardless of their actual DC definition. So you could have 2 replicas in one DC, and only 1 in another.
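For the multi-DC case, the ALTER names each data center explicitly (the DC names here are placeholders; use your own as reported by system.local):

```sql
ALTER KEYSPACE stackoverflow WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'west_dc': '3',
    'east_dc': '3'};
```

Follow this with nodetool repair on every node, for the reason given above: SimpleStrategy placement ignored the DC boundaries, so replicas may need to move.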
I have changed RFs for keyspaces on-the-fly several times. Usually, there are no issues. But it's a good idea to run nodetool describecluster when you're done, just to make sure all nodes have schema agreement.
Pro-tip: For future googlers, there is NO BENEFIT to creating keyspaces using SimpleStrategy. All it does, is put you in a position where you have to fix it later. In fact, I would argue that SimpleStrategy should NEVER BE USED.
So when will the data movement commence? In my case, since I now have specific rack IDs, I expect my replicas to switch nodes upon this ALTER KEYSPACE action.
This alone will not cause any adjustments of token range responsibility. If you already have a RF of 3 and so does your new DC definition, you won't need to run a repair, so nothing will stream.
I have a 15-node cluster divided into 5 racks, so each rack has 3 nodes. Since I previously had replication factor 3 and SimpleStrategy, more than one replica could have belonged to the same rack, whereas NetworkTopologyStrategy guarantees that no two replicas will share a rack. So shouldn't this cause data to move?
In that case, if you run a repair, your secondary or tertiary replicas may find a new home. But your primaries will stay the same.
So are you saying that nothing changes until I run a repair?
Correct.

Cassandra Error: "Unable to complete request: one or more nodes were unavailable."

I am a complete newbie at Cassandra and am just setting it up and playing around with it and testing different scenarios using cqlsh.
I currently have 4 nodes in 2 datacenters something like this (with proper IPs of course):
a.b.c.d=DC1:RACK1
a.b.c.d=DC1:RACK1
a.b.c.d=DC2:RACK1
a.b.c.d=DC2:RACK1
default=DCX:RACKX
Everything seems to make sense so far, except that I brought down a node on purpose just to see the resulting behaviour, and I noticed that I could no longer query/insert data on the remaining nodes: it results in "Unable to complete request: one or more nodes were unavailable."
I get that a node is unavailable (I did that on purpose), but isn't one of the main points of a distributed DB to keep functioning even as some nodes go down? Why does bringing one node down put a complete stop to everything?
What am I missing?
Any help would be greatly appreciated!!
You're correct in assuming that one node down should still allow you to query the cluster, but there are a few things to consider.
I'm assuming that "nodetool status" returns the expected results for that DC (i.e., "UN" for the up node, "DN" for the downed node).
Check the following:
Connection's consistency level (default is ONE)
Keyspace replication strategy and factor (default is SimpleStrategy, which is rack/DC unaware)
In cqlsh, run "DESCRIBE KEYSPACE <keyspace_name>"
Note that if you've been playing around with replication factor you'll need to run a "nodetool repair" on the nodes.
More reading here
Is it possible that you did not set the replication factor on your keyspace with a value greater than 1? For example:
CREATE KEYSPACE "Excalibur"
WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'dc1' : 2, 'dc2' : 2};
This configures your keyspace so that data is replicated to 2 nodes in each of the dc1 and dc2 datacenters.
If your replication factor is 1 and a node goes down that owns the data you are querying you will not be able to retrieve the data and C* will fail fast with an unavailable error. In general if C* detects that the consistency level cannot be met to service your query it will fail fast.
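You can check and adjust the connection's consistency level directly from cqlsh with the CONSISTENCY command (a cqlsh shell command, not a CQL statement), which is useful when testing these failure scenarios:

```sql
-- Show the current consistency level
CONSISTENCY;

-- Set it explicitly (ONE is the default)
CONSISTENCY ONE;
```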

Not enough replica available for query at consistency ONE (1 required but only 0 alive)

I have a Cassandra cluster with three nodes, two of which are up. They are all in the same DC. When my Java application goes to write to the cluster, I get an error in my application that seems to be caused by some problem with Cassandra:
Caused by: com.datastax.driver.core.exceptions.UnavailableException: Not enough replica available for query at consistency ONE (1 required but only 0 alive)
at com.datastax.driver.core.exceptions.UnavailableException.copy(UnavailableException.java:79)
The part that doesn't make sense is that "1 required but only 0 alive" statement. There are two nodes up, which means that one should be "alive" for replication.
Or am I misunderstanding the error message?
Thanks.
You are likely getting this error because the keyspace containing the table you are querying has a replication factor of one. Is that correct?
If the partition you are reading / updating does not have enough available replicas (nodes with that data) to meet the consistency level, you will get this error.
If you want to be able to handle more than 1 node being unavailable, what you could do is look into altering your keyspace to set a higher replication factor, preferably three in this case, and then running a nodetool repair on each node to get all of your data on all nodes. With this change, you would be able to survive the loss of 2 nodes to read at a consistency level of one.
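Assuming a single data center and a keyspace currently on SimpleStrategy, the change described above would look something like this (the keyspace name is a placeholder):

```sql
ALTER KEYSPACE mykeyspace
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'};
```

Then run nodetool repair on each node so existing data is streamed to its new replicas.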
This cassandra parameters calculator is a good reference for understanding the considerations of node count, replication factor, and consistency levels.
I hit this today because the datacenter field is case sensitive. If your dc is 'somedc01' this isn't going to work:
replication =
{
'class': 'NetworkTopologyStrategy',
'SOMEDC01': '3' # <-- BOOM!
}
AND durable_writes = true;
Anyway, it's not that intuitive, hope this helps.
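With the data center name in its correct (here, lowercase) form, the same definition works; the keyspace name below is a placeholder:

```sql
CREATE KEYSPACE somekeyspace WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'somedc01': '3'}
AND durable_writes = true;
```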
In my case, I got a message saying 0 replicas were available, but Cassandra was up and cqlsh worked correctly. The problem was accessing from Java: the query was over a complete table, and some records were not accessible (all nodes containing them were down). From cqlsh, select * from table works, but only shows the accessible records. So the solution is to recover the down nodes, and possibly to change the replication factor with:
ALTER KEYSPACE ....
nodetool repair -all
Then run nodetool status to see the changes and the cluster structure.
For me it was that my endpoint_snitch was still set to SimpleSnitch instead of something like GossipingPropertyFileSnitch. This was preventing the multi-DC cluster from connecting properly and manifesting in the error above.

Cassandra keyspace for counters

I am trying to create a table for keeping counters of hits to my APIs. I am using Cassandra 2.0.6 and am aware that there have been performance improvements to counters starting in 2.1.0, but I can't upgrade at this moment.
The documentation I read on DataStax always starts with creating a separate keyspace, like these:
http://www.datastax.com/documentation/cql/3.0/cql/cql_using/use_counter_t.html
http://www.datastax.com/documentation/cql/3.1/cql/cql_using/use_counter_t.html
From documentation:
Create a keyspace on Linux for use in a single data center, single node cluster. Use the default data center name from the output of the nodetool status command, for example datacenter1.
CREATE KEYSPACE counterks WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 1 };
Questions:
1) Does this mean that I should keep my counters in a separate keyspace?
2) If yes, should I declare the keyspace as defined in the documentation examples, or is that just an example and I can set my own replication strategy, specifically replicating across data centers?
Thanks
Sorry you had trouble with the instructions. The instructions need to be changed to make it clear that this is just an example and improved by changing RF to 3, for example.
Using a keyspace for a single data center and single node cluster is not a requirement. You do need to keep counters in separate tables, but not in separate keyspaces; however, keeping tables in separate keyspaces gives you the flexibility to vary replication settings between them. Normally you have one keyspace per application. See the related single vs. multiple keyspace discussion at http://grokbase.com/t/cassandra/user/145bwd3va8/effect-of-number-of-keyspaces-on-write-throughput.
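A counter table for API hit counts, living in whatever keyspace your application already uses, might look like this (the keyspace, table, and column names are illustrative):

```sql
-- A counter table: the primary key plus one or more counter columns
CREATE TABLE myapp.api_hits (
    api_name text PRIMARY KEY,
    hits counter
);

-- Counters are only ever modified via UPDATE, never INSERT
UPDATE myapp.api_hits SET hits = hits + 1 WHERE api_name = '/users';
```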
