Cassandra: Truncating a table twice throws consistency exception

I have a ScalaTest suite that's failing, and I have narrowed the cause down to the code that runs before tests and truncates a data table. If I run the following code, I can reproduce the problem:
session.execute(s"TRUNCATE ${dao.tableName};")
session.execute(s"TRUNCATE ${dao.tableName};")
Throws:
Error during truncate: Cannot achieve consistency level ALL
com.datastax.driver.core.exceptions.TruncateException: Error during truncate: Cannot achieve consistency level ALL
at com.datastax.driver.core.exceptions.TruncateException.copy(TruncateException.java:35)
at com.datastax.driver.core.ResultSetFuture.extractCauseFromExecutionException(ResultSetFuture.java:271)
at com.datastax.driver.core.ResultSetFuture.getUninterruptibly(ResultSetFuture.java:187)
at com.datastax.driver.core.Session.execute(Session.java:126)
at com.datastax.driver.core.Session.execute(Session.java:77)
at postingstore.cassandra.dao.PostingGroupDaoTest$$anonfun$2.apply$mcV$sp(PostingGroupDaoTest.scala:43)
at postingstore.cassandra.dao.PostingGroupDaoTest$$anonfun$2.apply(PostingGroupDaoTest.scala:39)
at postingstore.cassandra.dao.PostingGroupDaoTest$$anonfun$2.apply(PostingGroupDaoTest.scala:39)
at org.scalatest.FunSuite$$anon$1.apply(FunSuite.scala:1265)
at org.scalatest.Suite$class.withFixture(Suite.scala:1974)
at ledger.testsupport.JUnitFunSuiteTest.withFixture(JUnitFunSuiteTest.scala:10)
at org.scalatest.FunSuite$class.invokeWithFixture$1(FunSuite.scala:1262)
at ...
Caused by: com.datastax.driver.core.exceptions.TruncateException: Error during truncate: Cannot achieve consistency level ALL
at com.datastax.driver.core.Responses$Error.asException(Responses.java:91)
at com.datastax.driver.core.ResultSetFuture$ResponseCallback.onSet(ResultSetFuture.java:122)
at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:224)
at com.datastax.driver.core.RequestHandler.onSet(RequestHandler.java:361)
at com.datastax.driver.core.Connection$Dispatcher.messageReceived(Connection.java:510)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
I'm using the DataStax driver 2.0.0-RC2 against a cluster of three nodes.
Any ideas as to what's going wrong here?

It turns out this was caused by a node that had got into an inconsistent state after running out of disk space.

This is because of the consistency level. You cannot truncate data on all nodes using consistency level ALL. Set the consistency level to ONE or TWO instead; the truncate will then be applied on one node and, after some time, propagated to the other nodes.
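For illustration only, here is a minimal sketch (assuming the same session and dao as in the question, and the 2.0 Java driver API) of issuing the truncate as a statement with an explicitly lowered consistency level:
import com.datastax.driver.core.{ConsistencyLevel, SimpleStatement}

// Sketch: issue the TRUNCATE as a SimpleStatement so a per-statement
// consistency level can be set instead of relying on the session default.
val truncate = new SimpleStatement(s"TRUNCATE ${dao.tableName};")
truncate.setConsistencyLevel(ConsistencyLevel.ONE)
session.execute(truncate)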

Related

Altering Cassandra compression in production: is nodetool upgradesstables preferred?

We have a Cassandra keyspace with 2 tables in production. We have changed its compression strategy from LZ4Compressor (the default) to DeflateCompressor using
ALTER TABLE "Keyspace"."TableName" WITH compression = {'class': 'DeflateCompressor'};
We have around 300 GB of data on each node of our 5-node Cassandra cluster with replication factor 2. Is nodetool upgradesstables recommended as a best practice or not?
From all the sources we have read, I can use the nodetool upgradesstables command if necessary. But I want to know what the actual best practice is, since our data is in production.
Sources:
When you add compression to an existing column family, existing SSTables on disk are not
compressed immediately. Any new SSTables that are created will be compressed, and any existing SSTables will be
compressed during the normal Cassandra compaction process. If necessary, you can force existing SSTables to be
rewritten and compressed by using nodetool upgradesstables (Cassandra 1.0.4 or later) or nodetool scrub
UPDATE - After all nodes completed upgradesstables, a large number of exceptions are being encountered in my Cassandra logs; the cluster is now throwing a lot of errors.
Sample:
ERROR [ReadRepairStage:74899] 2018-04-08 14:50:09,779 CassandraDaemon.java:229 - Exception in thread Thread[ReadRepairStage:74899,5,main]
org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
at org.apache.cassandra.service.DataResolver$RepairMergeListener.close(DataResolver.java:171) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$2.close(UnfilteredPartitionIterators.java:182) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.db.transform.BaseIterator.close(BaseIterator.java:82) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.service.DataResolver.compareResponses(DataResolver.java:89) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.service.AsyncRepairCallback$1.runMayThrow(AsyncRepairCallback.java:50) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-3.10.jar:3.10]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_144]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_144]
at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) ~[apache-cassandra-3.10.jar:3.10]
at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_144]
DEBUG [ReadRepairStage:74889] 2018-04-08 14:50:07,777 ReadCallback.java:242 - Digest mismatch:
org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey(1013727261649388230, 715cb15cc5624c5a930ddfce290a690b) (d728e9a275616b0e05a0cd1b03bd9ef6 vs d41d8cd98f00b204e9800998ecf8427e)
at org.apache.cassandra.service.DigestResolver.compareResponses(DigestResolver.java:92) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.service.ReadCallback$AsyncRepairRunner.run(ReadCallback.java:233) ~[apache-cassandra-3.10.jar:3.10]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_144]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_144]
at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) [apache-cassandra-3.10.jar:3.10]
at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_144]
DEBUG [GossipStage:1] 2018-04-08 14:50:08,490 FailureDetector.java:457 - Ignoring interval time of 2000213620 for /10.196.22.208
DEBUG [ReadRepairStage:74899] 2018-04-08 14:50:09,778 DataResolver.java:169 - Timeout while read-repairing after receiving all 1 data and digest responses
ERROR [ReadRepairStage:74899] 2018-04-08 14:50:09,779 CassandraDaemon.java:229 - Exception in thread Thread[ReadRepairStage:74899,5,main]
org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
at org.apache.cassandra.service.DataResolver$RepairMergeListener.close(DataResolver.java:171) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$2.close(UnfilteredPartitionIterators.java:182) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.db.transform.BaseIterator.close(BaseIterator.java:82) ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.service.DataResolver.compareResponses(DataResolver.java:89) ~[apache-cassandra-3.10.jar:3.10]
When you use nodetool upgradesstables, it writes new SSTables from the existing ones using the new options that you specified. This is an IO-intensive process that may affect the performance of your cluster, so you need to plan it accordingly. You also need enough disk space to perform the operation, and the command should be run as the same user that runs Cassandra.
It really depends on your needs - if it's not urgent, you can simply wait until normal compaction occurs and the data is re-compressed.
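If you do decide to force the rewrite rather than wait for compaction, a hedged sketch of the invocation is below (run it on one node at a time and check the exact flags for your Cassandra version; -a forces rewriting SSTables that are already on the current format):
nodetool upgradesstables -a -- Keyspace TableName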

Cassandra 3.9 stuck in joining state

I have a 14-node Cassandra 3.9 cluster with ~250 GB of data on each node. Recently I have been attempting to add a 15th node to this cluster. The node has been stuck in the Joining state for the past 2 days, and nodetool netstats is clear. The main thing I find suspicious in the system.log of the joining node is errors like these:
ERROR [Native-Transport-Requests-1] 2018-02-16 15:43:32,635 Message.java:617 - Unexpected exception during request; channel = [id: 0x8ed1cb3b, L:/**.**.**.42:9042 - R:/**.**.**.**:41614]
java.lang.NullPointerException: null
at org.apache.cassandra.auth.PasswordAuthenticator.authenticate(PasswordAuthenticator.java:88) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.auth.PasswordAuthenticator.access$300(PasswordAuthenticator.java:59) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.auth.PasswordAuthenticator$PlainTextSaslAuthenticator.getAuthenticatedUser(PasswordAuthenticator.java:220) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:78) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513) [apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407) [apache-cassandra-3.9.jar:3.9]
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.39.Final.jar:4.0.39.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366) [netty-all-4.0.39.Final.jar:4.0.39.Final]
at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) [netty-all-4.0.39.Final.jar:4.0.39.Final]
at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:357) [netty-all-4.0.39.Final.jar:4.0.39.Final]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_91]
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) [apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.9.jar:3.9]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
This error message is from a client trying to connect to this node. It seems to fail authentication. How might I proceed in this situation? How should I bring this node to normalcy?
There are two different problems here:
The auth issue the client is facing is related to a bug in Cassandra 3.9 during bootstrap of new nodes. It has been resolved in later versions of Cassandra, as documented at https://issues.apache.org/jira/browse/CASSANDRA-12813.
We had a streaming issue similar to this with Cassandra 3.9. Taking a deeper look at the system.log, there was an error about a huge partition (greater than 100 MB) that could not be compacted because it exceeded the default commitlog_segment_size. We were able to get around it once we increased commitlog_segment_size_in_mb to 512 MB. Check for huge-partition warnings and adjust the size accordingly.
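For reference, that is a cassandra.yaml setting; the value below is simply the one we ended up with, not a general recommendation:
commitlog_segment_size_in_mb: 512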

Running nodetool resetlocalschema causes exceptions complaining system_auth keyspace doesn't exist

I have a 15-node Cassandra 3.9 cluster. I recently faced an issue where one of my nodes was piling up GossipStage messages. Following some guidance I found in a similar report, I ran 'nodetool resetlocalschema' on that node. Gossip warnings like these continue to show up in the logs:
WARN [GossipTasks:1] 2018-02-11 23:55:34,197 Gossiper.java:771 - Gossip stage has 180317 pending tasks; skipping status check (no nodes will be marked down)
I also see the following exception. Any guidance on how I can overcome this and bring the node back to normal? I should also mention that I have PasswordAuthenticator enabled in cassandra.yaml.
ERROR [Native-Transport-Requests-1] 2018-02-11 23:55:33,581 Message.java:617 - Unexpected exception during request; channel = [id: 0xbaa65545, L:/10.1.21.51:9042 - R:/10.1.86.40:35082]
java.lang.RuntimeException: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: Unknown keyspace/cf pair (system_auth.roles)
at org.apache.cassandra.auth.PasswordAuthenticator.authenticate(PasswordAuthenticator.java:107) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.auth.PasswordAuthenticator.access$300(PasswordAuthenticator.java:59) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.auth.PasswordAuthenticator$PlainTextSaslAuthenticator.getAuthenticatedUser(PasswordAuthenticator.java:220) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:78) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513) [apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407) [apache-cassandra-3.9.jar:3.9]
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.39.Final.jar:4.0.39.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366) [netty-all-4.0.39.Final.jar:4.0.39.Final]
at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) [netty-all-4.0.39.Final.jar:4.0.39.Final]
at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:357) [netty-all-4.0.39.Final.jar:4.0.39.Final]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_91]
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) [apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.9.jar:3.9]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
Caused by: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: Unknown keyspace/cf pair (system_auth.roles)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2203) ~[guava-18.0.jar:na]
at com.google.common.cache.LocalCache.get(LocalCache.java:3937) ~[guava-18.0.jar:na]
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941) ~[guava-18.0.jar:na]
at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824) ~[guava-18.0.jar:na]
at org.apache.cassandra.auth.AuthCache.get(AuthCache.java:108) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.auth.PasswordAuthenticator.authenticate(PasswordAuthenticator.java:88) ~[apache-cassandra-3.9.jar:3.9]
... 13 common frames omitted
Actually, this issue was resolved simply by restarting the seed nodes of my cluster first, followed by the rest of the nodes. Thanks for all the input; truly appreciated.

Failed to write statements

I'm using Spark with Cassandra and I want to write data into my Cassandra table:
CREATE TABLE IF NOT EXISTS MyTable(
user TEXT,
date TIMESTAMP,
event TEXT,
PRIMARY KEY((user), date, event)
);
But I got this error:
java.io.IOException: Failed to write statements to KeySpace.MyTable.
at com.datastax.spark.connector.writer.TableWriter$$anonfun$write$1.apply(TableWriter.scala:145)
at com.datastax.spark.connector.writer.TableWriter$$anonfun$write$1.apply(TableWriter.scala:120)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:100)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:99)
at com.datastax.spark.connector.cql.CassandraConnector.closeResourceAfterUse(CassandraConnector.scala:151)
at com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:99)
at com.datastax.spark.connector.writer.TableWriter.write(TableWriter.scala:120)
at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:56)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1214)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1203)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1202)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1202)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:696)
at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1420)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at org.apache.spark.scheduler.DAGSchedulerEventProcessActor.aroundReceive(DAGScheduler.scala:1375)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
15/04/28 17:57:47 WARN TaskSetManager: Lost task 13.2 in stage 1.0 (TID 43, dev2-cim.aid.fr): TaskKilled (killed intentionally)
and this warning in my Cassandra log file:
WARN [SharedPool-Worker-2] 2015-04-28 16:45:21,219 BatchStatement.java:243 - Batch of prepared statements for [*********] is of size 8158, exceeding specified threshold of 5120 by 3038
After some searching on the Internet, I found this link, which explains how the author fixed the same problem:
http://progexc.blogspot.fr/2015/03/write-batch-size-error-spark-cassandra.html
So now I've modified my Spark configuration to add:
conf.set("spark.cassandra.output.batch.grouping.key", "None")
conf.set("spark.cassandra.output.batch.size.rows", "10")
conf.set("spark.cassandra.output.batch.size.bytes", "2048")
These values removed the warning message from the Cassandra logs, but I still get the same error: Failed to write statements.
In my Spark log file I found this error:
Failed to execute:
com.datastax.spark.connector.writer.RichBatchStatement#67827d57
com.datastax.driver.core.exceptions.InvalidQueryException: Key may not be empty
at com.datastax.driver.core.Responses$Error.asException(Responses.java:103)
at com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:140)
at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:293)
at com.datastax.driver.core.RequestHandler.onSet(RequestHandler.java:455)
at com.datastax.driver.core.Connection$Dispatcher.messageReceived(Connection.java:734)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.handler.timeout.IdleStateAwareChannelUpstreamHandler.handleUpstream(IdleStateAwareChannelUpstreamHandler.java:36)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.handler.timeout.IdleStateHandler.messageReceived(IdleStateHandler.java:294)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
I had the same problem and found the solution in the comments above (by Amine CHERIFI and maasg).
The column corresponding to the primary key was not always filled with a proper value (in my case it was an empty string "").
This triggered the error:
ERROR QueryExecutor: Failed to execute: \
com.datastax.spark.connector.writer.RichBatchStatement#26ad2668 \
com.datastax.driver.core.exceptions.InvalidQueryException: Key may not be empty
The solution was to provide a default non-empty string.
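As an illustration only (the row type, column names and default value below are hypothetical and not taken from the original job; events is assumed to be an RDD of that type), the fix amounts to replacing empty partition keys before calling saveToCassandra:
import com.datastax.spark.connector._

case class Event(user: String, date: java.util.Date, event: String) // hypothetical row type

// Cassandra rejects empty partition keys with "Key may not be empty",
// so substitute a non-empty default before writing.
val cleaned = events.map(e => if (e.user == null || e.user.isEmpty) e.copy(user = "unknown") else e)
cleaned.saveToCassandra("KeySpace", "MyTable", SomeColumns("user", "date", "event"))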
If you are running in yarn-cluster mode, don't forget to check the entire log on YARN using yarn logs -applicationId <appId> --appOwner <appOwner>.
This gave me more reasons for the failure than the logs in the YARN web UI:
Caused by: com.datastax.driver.core.exceptions.UnavailableException: Not enough replicas available for query at consistency LOCAL_QUORUM (2 required but only 1 alive)
at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:50)
at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:37)
at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:266)
at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:246)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
... 11 more
The solution is to set spark.cassandra.output.consistency.level=ANY in your spark-defaults.conf.
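Equivalently, if you prefer to set it in code (matching the conf.set calls shown earlier) rather than in spark-defaults.conf:
conf.set("spark.cassandra.output.consistency.level", "ANY")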
I resolved the issue by restarting my cluster as well as the nodes.
The following is what I tried.
I was facing the same issue and tried all the options mentioned above, without success.
My data size is 174 GB in total. My cluster has 3 nodes, each with 16 cores and 48 GB of RAM.
I tried to load the 174 GB in a single shot and hit this issue.
After that I split the 174 GB into 109 files of 1.6 GB each and tried to load them; this time I faced the same problem again after loading 100 files (each 1.6 GB).
I thought the problem might be with the data in the 101st file. I tried loading that file into a new table, and tried loading new data into a new table, but all of these cases had the same issue.
Then I figured it was a problem with the Cassandra cluster and restarted the cluster and the nodes as well.
After that the issue went away.
Add a breakpoint at com/datastax/spark/connector/writer/AsyncExecutor.scala:45 and you can see the real exception.
In my case, the replication_factor of my keyspace was 2, but only one replica was alive.

Spark Cassandra connector connection error, no more host to try

I have an issue related to the DataStax spark-cassandra-connector. When I try to test our Spark-Cassandra connection, I use the code below. My problem is that this code throws an exception after some time (about half an hour). I think there is some connection issue; can anybody help? I am stuck.
SparkConf conf = new SparkConf(true)
        .setMaster("local")
        .set("spark.cassandra.connection.host", Config.CASSANDRA_CONTACT_POINT)
        .setAppName(Config.CASSANDRA_DB_NAME)
        .set("spark.executor.memory", Config.Spark_Executor_Memory);
SparkContext javaSparkContext = new SparkContext(conf);
SparkContextJavaFunctions functions = CassandraJavaUtil.javaFunctions(javaSparkContext);
for (;;) {
    JavaRDD<ObjectHandler> obj = functions.cassandraTable(Config.CASSANDRA_DB_NAME, "my_users", ObjectHandler.class);
    System.out.println("#####" + obj.count() + "#####");
}
Error:
java.lang.OutOfMemoryError: Java heap space
at org.jboss.netty.buffer.HeapChannelBuffer.slice(HeapChannelBuffer.java:201)
at org.jboss.netty.buffer.AbstractChannelBuffer.readSlice(AbstractChannelBuffer.java:323)
at com.datastax.driver.core.CBUtil.readValue(CBUtil.java:247)
at com.datastax.driver.core.Responses$Result$Rows$1.decode(Responses.java:395)
at com.datastax.driver.core.Responses$Result$Rows$1.decode(Responses.java:383)
at com.datastax.driver.core.Responses$Result$2.decode(Responses.java:201)
at com.datastax.driver.core.Responses$Result$2.decode(Responses.java:198)
at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:182)
at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:66)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:310)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
19:11:12.311 DEBUG [New I/O worker #1612][com.datastax.driver.core.Connection] Defuncting connection to /192.168.1.26:9042
com.datastax.driver.core.TransportException: [/192.168.1.26:9042] Unexpected exception triggered (java.lang.OutOfMemoryError: Java heap space)
at com.datastax.driver.core.Connection$Dispatcher.exceptionCaught(Connection.java:614)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:112)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:60)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.handler.codec.frame.FrameDecoder.exceptionCaught(FrameDecoder.java:377)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:112)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireExceptionCaught(Channels.java:525)
at org.jboss.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:48)
at org.jboss.netty.channel.DefaultChannelPipeline.notifyHandlerException(DefaultChannelPipeline.java:658)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:566)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:310)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.OutOfMemoryError: Java heap space
at org.jboss.netty.buffer.HeapChannelBuffer.slice(HeapChannelBuffer.java:201)
at org.jboss.netty.buffer.AbstractChannelBuffer.readSlice(AbstractChannelBuffer.java:323)
at com.datastax.driver.core.CBUtil.readValue(CBUtil.java:247)
at com.datastax.driver.core.Responses$Result$Rows$1.decode(Responses.java:395)
at com.datastax.driver.core.Responses$Result$Rows$1.decode(Responses.java:383)
at com.datastax.driver.core.Responses$Result$2.decode(Responses.java:201)
at com.datastax.driver.core.Responses$Result$2.decode(Responses.java:198)
at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:182)
at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:66)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:310)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
... 3 more
19:11:13.549 DEBUG [New I/O worker #1612][com.datastax.driver.core.Connection] [/192.168.1.26:9042-1] closing connection
19:11:12.311 DEBUG [main][com.datastax.driver.core.ControlConnection] [Control connection] error on /192.168.1.26:9042 connection, no more host to try
com.datastax.driver.core.ConnectionException: [/192.168.1.26:9042] Operation timed out
at com.datastax.driver.core.DefaultResultSetFuture.onTimeout(DefaultResultSetFuture.java:138)
at com.datastax.driver.core.Connection$ResponseHandler$1.run(Connection.java:763)
at org.jboss.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:546)
at org.jboss.netty.util.HashedWheelTimer$Worker.notifyExpiredTimeouts(HashedWheelTimer.java:446)
at org.jboss.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:395)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at java.lang.Thread.run(Thread.java:722)
19:11:13.551 DEBUG [main][com.datastax.driver.core.Cluster] Shutting down
Exception in thread "main" com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /192.168.1.26:9042 (com.datastax.driver.core.ConnectionException: [/192.168.1.26:9042] Operation timed out))
at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:195)
at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:79)
at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1143)
at com.datastax.driver.core.Cluster.getMetadata(Cluster.java:313)
at com.datastax.spark.connector.cql.CassandraConnector$.com$datastax$spark$connector$cql$CassandraConnector$$createSession(CassandraConnector.scala:166)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$4.apply(CassandraConnector.scala:151)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$4.apply(CassandraConnector.scala:151)
at com.datastax.spark.connector.cql.RefCountedCache.createNewValueAndKeys(RefCountedCache.scala:36)
at com.datastax.spark.connector.cql.RefCountedCache.acquire(RefCountedCache.scala:61)
at com.datastax.spark.connector.cql.CassandraConnector.openSession(CassandraConnector.scala:72)
at com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:97)
at com.datastax.spark.connector.cql.CassandraConnector.withClusterDo(CassandraConnector.scala:108)
at com.datastax.spark.connector.cql.Schema$.fromCassandra(Schema.scala:131)
at com.datastax.spark.connector.rdd.CassandraRDD.tableDef$lzycompute(CassandraRDD.scala:206)
at com.datastax.spark.connector.rdd.CassandraRDD.tableDef(CassandraRDD.scala:205)
at com.datastax.spark.connector.rdd.CassandraRDD.<init>(CassandraRDD.scala:212)
at com.datastax.spark.connector.SparkContextFunctions.cassandraTable(SparkContextFunctions.scala:48)
at com.datastax.spark.connector.SparkContextJavaFunctions.cassandraTable(SparkContextJavaFunctions.java:47)
at com.datastax.spark.connector.SparkContextJavaFunctions.cassandraTable(SparkContextJavaFunctions.java:89)
at com.datastax.spark.connector.SparkContextJavaFunctions.cassandraTable(SparkContextJavaFunctions.java:140)
at com.shephertz.app42.paas.spark.SegmentationWorker.main(SegmentationWorker.java:52)
It looks like you ran out of heap space:
java.lang.OutOfMemoryError: Java heap space
The Java driver (which the Spark connector uses to interact with Cassandra) defuncted a connection because an OutOfMemoryError was thrown while processing a request. When a connection is defuncted, its host is brought down.
The NoHostAvailableException is likely being raised because all of your hosts were brought down after their connections were defuncted, likely because of the OutOfMemoryError.
Do you know why you may be getting an OutOfMemoryError? What is your heap size? Are you doing anything that would cause a lot of objects to be on heap in your JVM? Do you possibly have a memory leak?
Your error probably lies in how the JVM is configured. If the settings are not correctly tuned, garbage collection could be causing issues. If you are using Cassandra > 2.0, see DataStax's "Tuning Java Resources".
On how Cassandra uses memory, from that document:
Using a java-based system like Cassandra, you can typically allocate
about 8GB of memory on the heap before garbage collection pause time
starts to become a problem. Modern machines have much more memory than
that and Cassandra can make use of additional memory as page cache
when files on disk are accessed. Allocating more than 8GB of memory on
the heap poses a problem due to the amount of Cassandra metadata about
data on disk. The Cassandra metadata resides in memory and is
proportional to total data. Some of the components grow proportionally
to the size of total memory.
In Cassandra 1.2 and later, the Bloom filter and compression offset
map that store this metadata reside off-heap, greatly increasing the
capacity per node of data that Cassandra can handle efficiently. In
Cassandra 2.0, the partition summary also resides off-heap.
Please post your JVM options for further help.
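For context, the Cassandra heap cap the quoted passage refers to is normally set in cassandra-env.sh; a sketch of the relevant variables is below (the values are placeholders, not a recommendation for this cluster):
MAX_HEAP_SIZE="8G"
HEAP_NEWSIZE="800M"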
