Cassandra cluster simple query error

I'm learning Cassandra and I have a problem. I have a cluster with two computers (node A and node B). On one computer I can create new users and keyspaces, and on the other I can use those users and keyspaces. But if I create a new table on either of these computers (inside Cassandra, in a keyspace), I can't see the new table with a simple query statement like SELECT * FROM table or SELECT * FROM keyspace.table. Cassandra displays this error: "ServerError: <ErrorMessage code=0000 [Server error] message='java.lang.AssertionError'>"
If I run nodetool status on node A (node + seed), it displays an error:
java.lang.RuntimeException: No nodes present in the cluster. Has this node finished starting up?
But if I run nodetool status on node B (node only), it displays one node: node B.
Keyspace statement:
CREATE KEYSPACE demo WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
Cassandra 3.2 is installed on Debian.
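For context, in a two-node cluster both nodes would normally share the same cluster_name and seed list in cassandra.yaml; a minimal sketch with hypothetical addresses (node A as the seed):
cluster_name: 'Test Cluster'
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # both nodes list node A's address here
          - seeds: "192.0.2.10"
# each node sets its own address (node A shown)
listen_address: 192.0.2.10
Mismatched cluster_name or seeds values between the two nodes is a common reason they disagree about cluster membership.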
What can I do? Any ideas? I can't fix it.

Related

Different results for same query in different Cassandra nodes

I have 3 Cassandra nodes. When I execute a query, 2 nodes give the same response but 1 node gives a different response.
Suppose I executed the following query:
select * from employee;
Node1 and Node2 return 2 rows, but Node3 returns 0 rows (an empty response).
How do I solve this issue?
1. You are not using NetworkTopologyStrategy.
2. Your replication factor is 2.
SimpleStrategy: use only for a single datacenter and one rack. SimpleStrategy places the first replica on a node determined by the partitioner; additional replicas are placed on the next nodes clockwise in the ring without considering topology (rack or datacenter location).
Go to this link:
https://docs.datastax.com/en/cassandra/3.0/cassandra/architecture/archDataDistributeReplication.html
I did the following steps, and the problem was solved; the data is now in sync on all 3 nodes:
1. Run the command nodetool rebuild on the instances.
2. Update 'replication_factor': '2' to 'replication_factor': '3' (as shown below).
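A minimal sketch of that replication change, assuming the keyspace is named mykeyspace (the name is hypothetical); note that the DataStax documentation also recommends running a full repair on each affected node after increasing the replication factor:
ALTER KEYSPACE mykeyspace
WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : '3' };
nodetool repair -full mykeyspace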

Lost data after running nodetool decommission

I have a 3-node cluster with 1 seed and the nodes in different zones, all running in GCE with GoogleCloudSnitch.
I wanted to change the hardware on each node, so I started by adding a new seed in a different region, which joined the cluster perfectly. Then I ran "nodetool decommission" on a node and, once it was done and the node was down, I removed it; "nodetool status" stated it was no longer in the cluster. I did this for all nodes, and lastly I did it on the "extra" seed in the different region, to remove it and get back to a 3-node cluster.
We lost data! What can possibly be the problem? I saw a command, "nodetool rebuild", which I ran and actually got some data back. "nodetool cleanup" didn't help either. Should I have run "nodetool flush" prior to "decommission"?
At the time of running "decommission" most keyspaces had ..
{'class' : 'NetworkTopologyStrategy', 'europe-west1' : 2}
Should I have first altered the keyspaces to include the new region/datacenter, which would be "'europe-west3' : 1" since only one node exists in that datacenter? I also noted that some keyspaces in the cluster had, by mistake ..
{ 'class' : 'SimpleStrategy', 'replication_factor' : 1 }
Could this have caused the loss of data? It seems that it was in the "SimpleStrategy keyspaces" the data was lost.
(Disclaimer: I'm a ScyllaDB employee)
Did you first add new nodes to replace the ones you were decommissioning, and configure the keyspace replication strategy accordingly? (You only mentioned the new seed node in your description; you did not mention whether you did it for the other nodes.)
Your data loss can very well be the result of the following:
1. Not altering the keyspaces to include the new region/zone with the proper replication strategy and replication factor (a sketch of the change follows this list).
2. Keyspaces that were configured with the SimpleStrategy (not network-aware) replication policy and a replication factor of 1. This means the data was stored on only 1 node, and once that node went down and was decommissioned, you basically lost the data.
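For reference, a sketch of that keyspace change, using the datacenter names from the question (the keyspace name mykeyspace is hypothetical); it would have to be run before decommissioning the old nodes, so that replicas exist in the new datacenter:
ALTER KEYSPACE mykeyspace
WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'europe-west1' : 2, 'europe-west3' : 1 };
After altering replication, running nodetool rebuild europe-west1 on the new node streams the existing data to it from the old datacenter.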
Did you by any chance take snapshots and store them outside your cluster? If you did, you could try to restore them.
I would highly recommend reviewing these procedures for a better understanding of the proper way to do what you intended:
http://docs.scylladb.com/procedures/add_dc_to_exist_dc/
http://docs.scylladb.com/procedures/replace_running_node/

Spark Cassandra Issue with KeySpace Replication

I have created table in Cassandra with below commands:
CREATE KEYSPACE test WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'dc1' : 3 } AND DURABLE_WRITES = false;
use test;
create table demo(id int primary key, name text);
Once the table was created successfully, I ran the code below to write data into Cassandra from Spark, but it fails with the error shown after the code.
Spark code snippet:
import com.datastax.spark.connector._
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import com.datastax.spark.connector.cql._
val connectorToClusterOne = CassandraConnector(sc.getConf.set("spark.cassandra.connection.host","xx.xx.xx.xx").set("spark.cassandra.auth.username", "xxxxxxx").set("spark.cassandra.auth.password", "xxxxxxx"))
// K/V mapping: the demo case class mirrors the table schema (id int, name text)
case class demo(id: Int, name: String)
val data = sc.textFile("/home/ubuntu/test.txt").map(_.split(",")).map(p => demo(p(0).toInt, p(1)))
implicit val c = connectorToClusterOne
data.saveToCassandra("test", "demo")
Below is the error description:
Error while computing token map for keyspace test with datacenter dc1: could not achieve replication factor 3 (found 0 replicas only), check your keyspace replication settings.
Could anyone suggest what the possible reason for this could be?
This error usually means that either the request is not being directed at the correct cluster, or the datacenter does not exist or has an incorrect name.
To make sure you are connecting to the correct cluster, double-check the connection host used by your Spark application.
To check the datacenter, use nodetool status to make sure that the datacenter you requested exists and includes no extraneous whitespace.
Lastly, it could be possible that all the nodes in the datacenter are down, so double check this as well.
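One quick check is to ask the node which datacenter it reports itself in; system.local is a standard Cassandra system table, and the result must match the name in the keyspace's replication map exactly:
-- must return exactly 'dc1' for the keyspace above to find replicas
SELECT data_center FROM system.local;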

Cassandra - Every command issued in CQLSH throws errors

Cassandra gives me a serious headache. Yesterday, everything was running fine. Then I dropped a table and ran a CQLSSTableWriter, which several times threw errors about my Lucene index (for not being on the classpath or the like), and now every command I issue in cqlsh throws errors.
CREATE KEYSPACE IF NOT EXISTS mydata WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};
takes a while and then throws:
Warning: schema version mismatch detected, which might be caused by DOWN nodes;
if this is not the case, check the schema versions of your nodes in system.local and system.peers.
OperationTimedOut: errors={}, last_host=XXX.XXX.XXX.20
After that, creating a new table also throws the same error.
cqlsh:mydata> create table test (id text PRIMARY KEY, id2 text);
Warning: schema version mismatch detected, which might be caused by DOWN nodes; if this is not the case, check the schema versions of your nodes in system.local and system.peers.
OperationTimedOut: errors={}, last_host=XXX.XXX.XXX.20
last_host always shows the IP of the host I run cqlsh on. I have tried the same commands from different nodes too.
The keyspace and table, however, are still being created! The error says something about mismatched schema versions, so to make sure I ran:
nodetool describecluster
Its output shows that all my nodes are on the same schema; no schema mismatches. I had also issued nodetool resetlocalschema before, without any luck.
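The schema versions the warning refers to can also be read directly from the system tables it mentions (standard columns in Cassandra 3.x); all nodes should report the same version once the schema has settled:
SELECT schema_version FROM system.local;
SELECT peer, schema_version FROM system.peers;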
When I go ahead and insert some data into the newly created table, the following error arises. Note that the insert statement itself does not return an error.
cqlsh:mydata> insert into test(id, id2) values('test1', 'test2');
cqlsh:mydata> select * from mydata.test ;
Traceback (most recent call last):
File "/usr/bin/cqlsh.py", line 1314, in perform_simple_statement
result = future.result()
File "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.0.0-6af642d.zip/cassandra-driver-3.0.0-6af642d/cassandra/cluster.py", line 3122, in result
raise self._final_exception
Unavailable: code=1000 [Unavailable exception] message="Cannot achieve consistency level ONE" info={'required_replicas': 1, 'alive_replicas': 0, 'consistency': 'ONE'}
Note that I have one datacenter and five nodes. I do not plan to use more than one datacenter in the future. [cqlsh 5.0.1 | Cassandra 3.0.8 | CQL spec 3.4.0 | Native protocol v4]
I have also restarted Cassandra multiple times. nodetool status shows that all nodes are up and running. Does anyone have a clue about what's going on?
I fixed this by:
1. dropping all tables in the keyspace
2. running alter keyspace mydata WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': '1'}; instead of SimpleStrategy
3. restarting the Cassandra service on all nodes
4. recreating all tables
5. running nodetool repair
Now I am able to insert and query data again. To be honest, though, I am still not quite sure what caused all of this.

NetworkTopologyStrategy on single cassandra node

I have created a keyspace in Cassandra, once using NetworkTopologyStrategy and another time using SimpleStrategy, with the following syntax:
Keyspace definition:
CREATE KEYSPACE cw WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'datacenter16' : 1 };
CREATE KEYSPACE cw WITH REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor' : 1};
Output of bin/nodetool ring:
Datacenter: 16
==========
Address       Rack  Status  State   Load      Owns     Token
172.16.4.196  4     Up      Normal  35.92 KB  100.00%  0
When I create a table in the NetworkTopologyStrategy keyspace and run a select * query on it, it returns the following error:
Unable to complete request: one or more nodes were unavailable
Whereas it works fine in the SimpleStrategy keyspace. Why is that so? Can't we use NetworkTopologyStrategy on a single-node Cassandra cluster?
While everyone else is right, you are already using a different snitch, as your datacenter name is '16'. In your nodetool ring output, you have Datacenter: 16. That means the datacenter name is actually '16', not 'datacenter16'.
Try this:
CREATE KEYSPACE cw WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', '16' : 1 };
By default, Cassandra is configured to use SimpleSnitch.
SimpleSnitch does not recognize datacenter and rack information, and hence can only be used with SimpleStrategy.
To change the snitch, edit the following in cassandra.yaml:
endpoint_snitch: <the snitch you want>
You then also have to change the corresponding properties file to define datacenters and racks (see the sketch below).
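For example, with GossipingPropertyFileSnitch (one common network-aware choice; the values below are assumptions, not the poster's actual settings) each node declares its own location in cassandra-rackdc.properties, and the dc value is what must match the keyspace's replication map:
# cassandra-rackdc.properties
dc=datacenter16
rack=rack1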
You have to define a network-aware snitch in order to use NetworkTopologyStrategy. See this document for more information: http://docs.datastax.com/en/cassandra/2.1/cassandra/architecture/architectureSnitchPFSnitch_t.html
