Cassandra Streaming error - Unknown keyspace system_traces

In our dev cluster, which had been running smoothly until now, the following failure occurs whenever we replace a node (something we do constantly) and prevents the replacement node from joining.
The Cassandra version is 2.0.7.
What can be done about it?
ERROR [STREAM-IN-/10.128.---.---] 2014-11-19 12:35:58,007 StreamSession.java (line 420) [Stream #9cad81f0-6fe8-11e4-b575-4b49634010a9] Streaming error occurred
java.lang.AssertionError: Unknown keyspace system_traces
at org.apache.cassandra.db.Keyspace.<init>(Keyspace.java:260)
at org.apache.cassandra.db.Keyspace.open(Keyspace.java:110)
at org.apache.cassandra.db.Keyspace.open(Keyspace.java:88)
at org.apache.cassandra.streaming.StreamSession.addTransferRanges(StreamSession.java:239)
at org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:436)
at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:368)
at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:289)
at java.lang.Thread.run(Thread.java:745)

I got the same error while I was trying to set up my cluster. As I was experimenting with different switches in cassandra.yaml, I restarted the service multiple times and removed the system dir under the data directory (/var/lib/cassandra/data, as mentioned here).
I guess that for some reason Cassandra tries to load the system_traces keyspace (the other dir under /var/lib/cassandra/data) and fails, and nodetool throws this error. You can just remove both system and system_traces before starting the Cassandra service, or, even better, delete all content of the commitlog, data and saved_caches directories there.
Obviously, this only works if you don't have any data in the system yet.
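A rough sketch of that cleanup, assuming the default package layout under /var/lib/cassandra (adjust the paths to whatever your data_file_directories, commitlog_directory and saved_caches_directory settings in cassandra.yaml point at):

sudo service cassandra stop
# remove only the system keyspaces that fail to load
sudo rm -rf /var/lib/cassandra/data/system /var/lib/cassandra/data/system_traces
# or, if the node holds no data you care about, wipe everything it has written
sudo rm -rf /var/lib/cassandra/data/* /var/lib/cassandra/commitlog/* /var/lib/cassandra/saved_caches/*
sudo service cassandra start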

Related

advanced.session-leak after some time of starting Spark Thrift server with DataStax Cassandra connector

Hi, I am getting the following error after some time of inactivity.
Error: Error running query: com.typesafe.config.ConfigException$Missing: withValue(advanced.reconnection-policy.base-delay): No configuration setting found for key 'advanced.session-leak' (state=,code=0)
Restarting the Thrift server seems to solve the issue for some time.

Cassandra: Streaming error - Compressed lengths mismatch

I am restoring a snapshot in Cassandra using sstableloader. The sstable loading process fails for some of the nodes in the cluster with the following error.
Error from the sstableloader command:
Streaming to the following hosts failed:
[/10.x.x.x, /10.x.x.x, /10.x.x.x]
java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed
at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:98)
at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:48)
Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
Error in the logs of one of the failed nodes:
[Stream #bac90a-32] Streaming error occurred on session with peer 10.x.x.x
java.io.IOException: Compressed lengths mismatch
at org.apache.cassandra.io.compress.LZ4Compressor.uncompress(LZ4Compressor.java:147) ~[apache-cassandra-3.11.4.jar:3.11.4]
at org.apache.cassandra.streaming.compress.CompressedInputStream.decompress(CompressedInputStream.java:163) ~[apache-cassandra-3.11.4.jar:3.11.4]
at org.apache.cassandra.streaming.compress.CompressedInputStream.decompressNextChunk(CompressedInputStream.java:109) ~[apache-cassandra-3.11.4.jar:3.11.4]
at org.apache.cassandra.streaming.compress.CompressedInputStream.read(CompressedInputStream.java:121) ~[apache-cassandra-3.11.4.jar:3.11.4]
What can be the possible cause of length mismatch?
This generally indicates that the sstable that was streamed to the node was corrupted. You could try running nodetool scrub on the source node and see if any corruption is identified by reviewing the scrub output in system.log. Once you have fixed the source corruption, you could try taking the snapshot and loading it using sstableloader again.
You could also try to identify which sstable in the snapshot is corrupted by running sstabledump on each sstable file. If you get the same error when running sstabledump, you know you have found the corrupted file. You could delete that file and try to load the rest of them.
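As a rough sketch of both checks (the keyspace, table and snapshot path below are placeholders for your own):

# on the source node: rewrite the sstables and report any corruption in system.log
nodetool scrub my_keyspace my_table
# on the snapshot: dump each sstable and flag the ones that fail to parse
for f in /path/to/snapshot/*-Data.db; do
    sstabledump "$f" > /dev/null || echo "corrupted: $f"
done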

CouchDB v1.7.1 database replication to CouchDB v2.3.0 database fails

In Fauxton, I've set up a replication rule from a CouchDB v1.7.1 database to a new CouchDB v2.3.0 database.
The source does not have any authentication configured. The target does. I've added the username and password to the Job Configuration.
It looks like the replication got stuck somewhere in the process. 283.8 KB (433 documents) are present in the new database. The source contains about 18.7 MB (7215 docs) of data.
When restarting the database, I'm always getting the following error:
[error] 2019-02-17T17:29:45.959000Z nonode#nohost <0.602.0> --------
throw:{unauthorized,<<"unauthorized to access or create database
http://my-website.com/target-database-name/">>}:
Replication 5b4ee9ddc57bcad01e549ce43f5e31bc+continuous failed to
start "https://my-website.com/source-database-name/ "
-> "http://my-website.com/target-database-name/ " doc
<<"shards/00000000-1fffffff/_replicator.1550593615">>:<<"1e498a86ba8e3349692cc1c51a00037a">>
stack:[{couch_replicator_api_wrap,db_open,4,[{file,"src/couch_replicator_api_wrap.erl"},{line,114}]},{couch_replicator_scheduler_job,init_state,1,[{file,"src/couch_replicator_scheduler_job.erl"},{line,584}]}]
I'm not sure what is going on here. From the logs I understand there's an authorization issue. But the database is already present (hence, it has been replicated partially already).
What does this error mean and how can it be resolved?
The reason for this error is that the CouchDB v2.3.0 instance was being re-initialized on reboot. It required me to fill in the cluster configuration again.
Therefore, the replication could not continue until I had re-applied that configuration.
The issue with having to re-apply the cluster configuration has been solved in another SO question.
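For reference, a minimal sketch of checking and re-finishing the cluster setup over the HTTP API (the host and admin credentials are placeholders, the same can be done from Fauxton's Setup screen, and the exact actions needed depend on your cluster layout):

# check whether the node still considers the cluster setup finished
curl http://admin:password@127.0.0.1:5984/_cluster_setup
# re-apply / finish the cluster configuration
curl -X POST http://admin:password@127.0.0.1:5984/_cluster_setup \
     -H "Content-Type: application/json" \
     -d '{"action": "finish_cluster"}'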

Unable to start Kudu master

While starting kudu-master, I am getting the error below and am unable to start the Kudu cluster.
F0706 10:21:33.464331 27576 master_main.cc:71] Check failed: _s.ok() Bad status: Invalid argument: Unable to initialize catalog manager: Failed to initialize sys tables async: on-disk master list (hadoop-master:7051, slave2:7051, slave3:7051) and provided master list (:0) differ. Their symmetric difference is: :0, hadoop-master:7051, slave2:7051, slave3:7051
It is a cluster of 8 nodes, and I have provided 3 masters, as given below, in master.gflagfile on the master nodes.
--master_addresses=hadoop-master,slave2,slave3
TL;DR
If this is a new installation, working under the assumption that the master IP addresses are correct, I believe the easiest solution is to (see the sketch after these steps):
Stop kudu masters
Nuke the <kudu-data-dir>/master directory
Start kudu masters
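Roughly, assuming the packaged init scripts and a default master directory of /var/lib/kudu/master (substitute whatever your --fs_wal_dir / --fs_data_dirs flags point at):

# on each master node
sudo service kudu-master stop
sudo rm -rf /var/lib/kudu/master        # placeholder for <kudu-data-dir>/master
sudo service kudu-master start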
Explanation
I believe the most common (if not the only) cause of this error (Failed to initialize sys tables async: on-disk master list (hadoop-master:7051, slave2:7051, slave3:7051) and provided master list (:0) differ.) is a kudu master node getting added incorrectly. The error suggests that kudu-master thinks it's running on a single node rather than in a 3-node cluster.
Maybe you did not intend to "add a node", but that's most likely what happened. I'm saying this because I had the same problem; after some googling and debugging, I discovered that during the installation I had started kudu-master before putting the correct IP addresses in master.gflagfile, so kudu-master was spun up thinking it was running on a single node, not a 3-node cluster. Using the steps above to clean-install kudu-master again solved my problem.

cassandra sstable-loader error: "Got an unknow host from describe_ring()"

I am trying to load sstables into a Cassandra cluster of two nodes with the sstable-loader utility provided in Cassandra 0.8.4.
1) I have loaded the data successfully in a single-node environment.
2) After I created the cluster of two nodes, the loading throws the following exception after gossip:
java.lang.RuntimeException: Got an unknow host from describe_ring()
This is a bug in 0.8.4 (https://issues.apache.org/jira/browse/CASSANDRA-3044). It's fixed in 0.8.5; you can test that by following the link on the release thread here.
