Cassandra: Exiting due to error while processing commit log during initialization - cassandra

I was loading a large CSV into Cassandra using cassandra-loader.
The VM ran out of disk space during this process and crashed. I allocated more disk space to the VM and tried starting cassandra but it refused to start due to problems with SSTables and commit log.
I could not run nodetool repair as it is only works when the node is online.
I ran sstablescrub which took about 1 hour to finish. So I thought it might have fixed it.
But I still get this error in system.log
ERROR [SSTableBatchOpen:4] 2015-10-23 18:57:45,035 SSTableReader.java:506 - Corrupt sstable /var/lib/cassandra/data/keyspace1/location-777a33d0772911e597a98b820c5778a4/la-1709-big=[TOC.txt, CompressionInfo.db, Statistics.db, Digest.adler32, Data.db, Index.db, Filter.db]; skipping table
org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:125) ~[apache-cassandra-2.2.3.jar:2.2.3]
at org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:86) ~[apache-cassandra-2.2.3.jar:2.2.3]
at org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:142) ~[apache-cassandra-2.2.3.jar:2.2.3]
at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:101) ~[apache-cassandra-2.2.3.jar:2.2.3]
at org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:187) ~[apache-cassandra-2.2.3.jar:2.2.3]
at org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:179) ~[apache-cassandra-2.2.3.jar:2.2.3]
at org.apache.cassandra.io.sstable.format.SSTableReader.load(SSTableReader.java:703) ~[apache-cassandra-2.2.3.jar:2.2.3]
at org.apache.cassandra.io.sstable.format.SSTableReader.load(SSTableReader.java:664) ~[apache-cassandra-2.2.3.jar:2.2.3]
at org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:458) ~[apache-cassandra-2.2.3.jar:2.2.3]
at org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:363) ~[apache-cassandra-2.2.3.jar:2.2.3]
at org.apache.cassandra.io.sstable.format.SSTableReader$4.run(SSTableReader.java:501) ~[apache-cassandra-2.2.3.jar:2.2.3]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_60]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_60]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_60]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_60]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
Caused by: java.io.EOFException: null
at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340) ~[na:1.8.0_60]
at java.io.DataInputStream.readUTF(DataInputStream.java:589) ~[na:1.8.0_60]
at java.io.DataInputStream.readUTF(DataInputStream.java:564) ~[na:1.8.0_60]
at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:96) ~[apache-cassandra-2.2.3.jar:2.2.3]
... 15 common frames omitted
INFO [main] 2015-10-23 18:57:45,193 ColumnFamilyStore.java:382 - Initializing system_auth.role_permissions
INFO [main] 2015-10-23 18:57:45,201 ColumnFamilyStore.java:382 - Initializing system_auth.resource_role_permissons_index
INFO [main] 2015-10-23 18:57:45,213 ColumnFamilyStore.java:382 - Initializing system_auth.roles
INFO [main] 2015-10-23 18:57:45,233 ColumnFamilyStore.java:382 - Initializing system_auth.role_members
INFO [main] 2015-10-23 18:57:45,240 ColumnFamilyStore.java:382 - Initializing system_traces.sessions
INFO [main] 2015-10-23 18:57:45,252 ColumnFamilyStore.java:382 - Initializing system_traces.events
INFO [main] 2015-10-23 18:57:45,265 ColumnFamilyStore.java:382 - Initializing simplex.songs
INFO [main] 2015-10-23 18:57:45,276 ColumnFamilyStore.java:382 - Initializing simplex.playlists
INFO [pool-2-thread-1] 2015-10-23 18:57:45,289 AutoSavingCache.java:187 - reading saved cache /var/lib/cassandra/saved_caches/KeyCache-ca.db
INFO [pool-2-thread-1] 2015-10-23 18:57:45,313 AutoSavingCache.java:163 - Completed loading (25 ms; 36 keys) KeyCache cache
INFO [main] 2015-10-23 18:57:45,351 CommitLog.java:168 - Replaying /var/lib/cassandra/commitlog/CommitLog-5-1445578022702.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022703.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022704.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022705.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022706.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022707.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022708.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022709.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022710.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022712.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022713.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022714.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022715.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022716.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022717.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022719.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022720.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022721.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022722.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022723.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022724.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022725.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022727.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022728.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022730.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022731.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022732.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022733.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022734.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022736.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022738.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022740.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022741.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022743.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022744.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022745.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022746.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022748.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022749.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022750.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022751.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022752.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022753.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022755.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022756.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022758.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022759.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022760.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022761.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022763.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022764.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022765.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022766.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022767.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022768.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022769.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022770.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022771.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022772.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022773.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022774.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022775.log, /var/lib/cassandra/commitlog/CommitLog-5-1445578022776.log, /var/lib/cassandra/commitlog/CommitLog-5-1445588991268.log, /var/lib/cassandra/commitlog/CommitLog-5-1445589094722.log, /var/lib/cassandra/commitlog/CommitLog-5-1445589149527.log, /var/lib/cassandra/commitlog/CommitLog-5-1445595828633.log, /var/lib/cassandra/commitlog/CommitLog-5-1445595898055.log, /var/lib/cassandra/commitlog/CommitLog-5-1445596033717.log, /var/lib/cassandra/commitlog/CommitLog-5-1445596400441.log, /var/lib/cassandra/commitlog/CommitLog-5-1445596601854.log, /var/lib/cassandra/commitlog/CommitLog-5-1445598032544.log, /var/lib/cassandra/commitlog/CommitLog-5-1445598758663.log, /var/lib/cassandra/commitlog/CommitLog-5-1445601112953.log, /var/lib/cassandra/commitlog/CommitLog-5-1445601937334.log, /var/lib/cassandra/commitlog/CommitLog-5-1445601985416.log, /var/lib/cassandra/commitlog/CommitLog-5-1445604504389.log, /var/lib/cassandra/commitlog/CommitLog-5-1445606516196.log
ERROR [main] 2015-10-23 18:59:05,091 JVMStabilityInspector.java:78 - Exiting due to error while processing commit log during initialization.
org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: Mutation checksum failure at 4110758 in CommitLog-5-1445578022776.log
at org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:622) [apache-cassandra-2.2.3.jar:2.2.3]
at org.apache.cassandra.db.commitlog.CommitLogReplayer.replaySyncSection(CommitLogReplayer.java:492) [apache-cassandra-2.2.3.jar:2.2.3]
at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:388) [apache-cassandra-2.2.3.jar:2.2.3]
at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:147) [apache-cassandra-2.2.3.jar:2.2.3]
at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:189) [apache-cassandra-2.2.3.jar:2.2.3]
at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:169) [apache-cassandra-2.2.3.jar:2.2.3]
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:273) [apache-cassandra-2.2.3.jar:2.2.3]
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:513) [apache-cassandra-2.2.3.jar:2.2.3]
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:622) [apache-cassandra-2.2.3.jar:2.2.3]
Exiting due to error while processing commit log during initialization.
How do I fix this? This is test data, I am okay with losing it. How would one handle this in production to avoid data loss?
Tried setting disk_failure_policy: ignore so that I could run nodetool repair once the server is up. But the server does not start even with this setting.
I am operating a single node and replication factor is 1. Would having more nodes and a >1 replication factor enabled me to fix an issue like this without data loss?
I am using Cassandra 2.2.3

Since you don't care about the data, removing files from \data\commitlogs should be easiest solution.

Having some replication would surely help you to fix this without data loss but it would come with a price.
Despite all your effort you cannot manage to recover your corrupted sstable. So you decide to remove it from your file system to start Cassandra again. If you do not have replication your data is lost. But if you have replication on the cluster, you can possibly fetch the data from other nodes. That is what nodetool repair do !
So nodetool repair does not repair corrupted sstable. Basicallynodetool repair compare tables from node to node to find missing or inconsistent data and then repair it. You can find more information on how it works here.
However nodetool repair is very expensive, it is long and uses a lot of cpu, disk and network. There is this good post about repair benefits and drawbacks.

This is how I fixed the problem with commit logs.
You should only do this if you don't care about preserving the state of your commit logs.
Try to restart cassandra using
sudo systemctl restart cassandra
Then I check
systemctl status cassandra
and see that the status is 'exited' so there is a problem. Check the logs for cassandra using
sudo less /var/log/cassandra/system.log
and see org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: Could not read commit log descriptor in file /var/lib/cassandra/commitlog/CommitLog-6-1498210233635.log
Because I don't care about preserving the state of Cassandra I delete all of the commit logs and it now boots up fine
sudo rm /var/lib/cassandra/commitlog/CommitLog*
sudo systemctl restart cassandra
systemctl status cassandra (should confirm that it it now running)

Simply goes in log directory in cassandra and delete the log files.
It work fine....

Related

When running "local-cluster" model in Apache Spark, how to prevent executor from dissociating prematurely?

I have a Spark application that should be tested in both local mode & local-cluster mode, using scalatest.
The local-cluster mode is submitted using this method:
How to scala-test a Spark program under "local-cluster" mode?
The test run successfully, but when terminating the test I got the following error in the log:
22/05/16 17:45:25 ERROR TaskSchedulerImpl: Lost executor 0 on 172.16.224.18: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
22/05/16 17:45:25 ERROR Worker: Failed to launch executor app-20220516174449-0000/2 for Test.
java.lang.IllegalStateException: Shutdown hooks cannot be modified during shutdown.
at org.apache.spark.util.SparkShutdownHookManager.add(ShutdownHookManager.scala:195)
at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:153)
at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:142)
at org.apache.spark.deploy.worker.ExecutorRunner.start(ExecutorRunner.scala:77)
at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:547)
at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:117)
at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:215)
at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:102)
at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:221)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
22/05/16 17:45:25 ERROR Worker: Failed to launch executor app-20220516174449-0000/3 for Test.
java.lang.IllegalStateException: Shutdown hooks cannot be modified during shutdown.
at org.apache.spark.util.SparkShutdownHookManager.add(ShutdownHookManager.scala:195)
at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:153)
at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:142)
at org.apache.spark.deploy.worker.ExecutorRunner.start(ExecutorRunner.scala:77)
at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:547)
at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:117)
at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:215)
at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:102)
at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:221)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
22/05/16 17:45:25 ERROR Worker: Failed to launch executor app-20220516174449-0000/4 for Test.
java.lang.IllegalStateException: Shutdown hooks cannot be modified during shutdown.
at org.apache.spark.util.SparkShutdownHookManager.add(ShutdownHookManager.scala:195)
at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:153)
at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:142)
at org.apache.spark.deploy.worker.ExecutorRunner.start(ExecutorRunner.scala:77)
at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:547)
at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:117)
at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:215)
at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:102)
at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:221)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
22/05/16 17:45:25 ERROR Worker: Failed to launch executor app-20220516174449-0000/5 for Test.
java.lang.IllegalStateException: Shutdown hooks cannot be modified during shutdown.
at org.apache.spark.util.SparkShutdownHookManager.add(ShutdownHookManager.scala:195)
at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:153)
at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:142)
at org.apache.spark.deploy.worker.ExecutorRunner.start(ExecutorRunner.scala:77)
at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:547)
at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:117)
at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:215)
at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:102)
at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dis
...
It turns out executor 0 was dropped before the SparkContext is stopped, this triggered a violent self-healing reaction from Spark master that tries to repeatedly launch new executors to compensate for the loss. How do I prevent this from happening?
Spark attempts to recover from failed tasks by attempting to run them again. What you can do to avoid this is to set some properties to 1 in
spark.task.maxFailures (default is 4)
spark.stage.maxConsecutiveAttempts (default is 4)
These properties can be set in $SPARK_HOME/conf/spark-defaults.conf or given as options to spark-submit:
spark-submit --conf spark.task.maxFailures=1 --conf spark.stage.maxConsecutiveAttempts=1
or in the Spark context/session configuration before starting the session.
EDIT:
It looks like your executors are lost due to insufficient memory. You could try to increase:
spark.executor.memory
spark.executor.memoryOverhead
spark.memory.offHeap.size with (spark.memory.offHeap.enabled=true)
(see Spark configuration)
The maximum memory size of container to running executor is determined by the sum of spark.executor.memoryOverhead, spark.executor.memory, spark.memory.offHeap.size and spark.executor.pyspark.memory.

Datastax CassandraRoleManager skipped default role setup some nodes were not ready

I have upgraded DSE cluster with 2 nodes from 5.0.7 to 6.7.3. After upgrade with nodetool status shows both nodes are "UP NORMAL" with apprx 75 GB load on each and cluster works for applications with read write. but getting error during
Nodetool repair -pr some repair failed
Upgrade sstable makes node down.
and observing exception every 10 seconds in system.log file
WARN [OptionalTasks:1] 2019-07-18 08:20:14,495 CassandraRoleManager.java:386 - CassandraRoleManager skipped default role setup: some nodes were not ready
INFO [OptionalTasks:1] 2019-07-18 08:20:14,495 CassandraRoleManager.java:432 - Setup task failed with error, rescheduling
org.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level ONE
at org.apache.cassandra.db.ConsistencyLevel.assureSufficientLiveNodes(ConsistencyLevel.java:392)
at org.apache.cassandra.service.AbstractReadExecutor.getReadExecutor(AbstractReadExecutor.java:214)
at org.apache.cassandra.service.AbstractReadExecutor.getReadExecutor(AbstractReadExecutor.java:190)
at org.apache.cassandra.service.StorageProxy$SinglePartitionReadLifecycle.<init>(StorageProxy.java:1541)
at org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1524)
at org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1447)
at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1325)
at org.apache.cassandra.db.SinglePartitionReadCommand$Group.execute(SinglePartitionReadCommand.java:1274)
at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:366)
at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:574)
at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:307)
at org.apache.cassandra.cql3.QueryProcessor.lambda$processStatement$4(QueryProcessor.java:256)
at io.reactivex.internal.operators.single.SingleDefer.subscribeActual(SingleDefer.java:36)
at io.reactivex.Single.subscribe(Single.java:2700)
at io.reactivex.internal.operators.single.SingleMap.subscribeActual(SingleMap.java:34)
at io.reactivex.Single.subscribe(Single.java:2700)
at io.reactivex.Single.blockingGet(Single.java:2153)
at org.apache.cassandra.concurrent.TPCUtils.blockingGet(TPCUtils.java:75)
at org.apache.cassandra.cql3.QueryProcessor.processBlocking(QueryProcessor.java:352)
at org.apache.cassandra.auth.CassandraRoleManager.hasExistingRoles(CassandraRoleManager.java:396)
at org.apache.cassandra.auth.CassandraRoleManager.setupDefaultRole(CassandraRoleManager.java:370)
at org.apache.cassandra.auth.CassandraRoleManager.doSetupDefaultRole(CassandraRoleManager.java:428)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)

Cassandra : Cannot replace address with a node that is already bootstrapped

One of the existing nodes in a 5 node Cassandra (3.9) cluster fails to come up.
I noticed the node to be down and tried to restart using the command
service cassandra restart
But the node fails to come and I see the below exception in system.log
ERROR [main] 2017-04-14 10:03:49,959 CassandraDaemon.java:747 -
Exception encountered during startup java.lang.RuntimeException:
Cannot replace address with a node that is already bootstrapped
at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:752)
~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:648)
~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:548)
~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:385)
[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:601)
[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:730)
[apache-cassandra-3.9.jar:3.9] WARN [StorageServiceShutdownHook]
2017-04-14 10:03:49,963 Gossiper.java:1508 - No local state or state
is in silent shutdown, not announcing shutdown
WARN [StorageServiceShutdownHook] 2017-04-14 10:51:49,539
Gossiper.java:1508 - No local state or state is in silent shutdown,
not announcing shutdown
Thanks
Have a look at this guide, basically you have a dead node in the cluster, happens all the time ;)
https://blog.alteroot.org/articles/2014-03-12/replace-a-dead-node-in-cassandra.html
Plus some additional descriptions:
https://issues.apache.org/jira/browse/CASSANDRA-7356
Plus check that you also remove the address from:
/etc/cassandra/cassandra-env.sh

Cassandra: error in Initializing system.sstable activity

I am getting some problem at cassandra startup. The details of the exceptios is as follows:-
INFO 06:49:10 Initializing system.sstable_activity
ERROR 06:49:10 Exiting due to error while processing commit log during initialization.
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
at org.apache.cassandra.db.commitlog.CommitLogDescriptor.writeHeader(CommitLogDescriptor.java:73) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.commitlog.CommitLogSegment.<init>(CommitLogSegment.java:168) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.commitlog.CommitLogSegment.freshSegment(CommitLogSegment.java:119) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.db.commitlog.CommitLogSegmentManager$1.runMayThrow(CommitLogSegmentManager.java:119) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) [apache-cassandra-2.1.9.jar:2.1.9]
at java.lang.Thread.run(Unknown Source) [na:1.8.0_60]
What can be the reason for this ?
I fixed this issue by deleting a commit log that was causing the issue.
I found the log in /casandra/data/commitlog .
After I deleted, I re-ran cassandra and all was well. I hope this helps someone!
Dunno what is the reason to this, but restarting my system solved the problem.
This is how I fixed the problem with commit logs. You should only do this if you don't care about preserving the state of your commit logs.
Try to restart cassandra using
sudo systemctl restart cassandra
Then I check
systemctl status cassandra
and see that the status is 'exited' so there is a problem. Check the logs for cassandra using
sudo less /var/log/cassandra/system.log
and see org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: Could not read commit log descriptor in file /var/lib/cassandra/commitlog/CommitLog-6-1498210233635.log
Because I don't care about preserving the state of Cassandra I delete all of the commit logs and it now boots up fine
sudo rm /var/lib/cassandra/commitlog/CommitLog*
sudo systemctl restart cassandra
systemctl status cassandra (should confirm that it it now running)

Cassandra compaction fails: IOError IOException Map failed

We have been running Cassandra 1.0.2 in production for many months, with minimal trouble. Lately, we have started to see consistent failures in all nodes, with this error:
ERROR [CompactionExecutor:199] 2013-03-02 00:21:26,179 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[CompactionExecutor:199,1,RMI Runtime]
java.io.IOError: java.io.IOException: Map failed
at org.apache.cassandra.io.util.MmappedSegmentedFile$Builder.createSegments(MmappedSegmentedFile.java:225)
at org.apache.cassandra.io.util.MmappedSegmentedFile$Builder.complete(MmappedSegmentedFile.java:202)
at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:308)
at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:174)
at org.apache.cassandra.db.compaction.CompactionManager$4.call(CompactionManager.java:275)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: Map failed
at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:758)
at org.apache.cassandra.io.util.MmappedSegmentedFile$Builder.createSegments(MmappedSegmentedFile.java:217)
... 9 more
Caused by: java.lang.OutOfMemoryError: Map failed
at sun.nio.ch.FileChannelImpl.map0(Native Method)
at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:755)
... 10 more
INFO [Thread-2] 2013-03-02 00:21:26,222 MessagingService.java (line 488) Shutting down MessageService...
The node dies after this. We remove the oldest data files and start the node again. The node runs for a while, then dies again. This happens for all eight nodes in our ring.
We are using the default compaction strategy (size tiered) and we think it is the best choice for our problem domain.
Q: Why is this happening and what should we do to fix it?

Resources