Cassandra is giving a digest mismatch error. Restarting the service on all nodes isn't helping.
ERROR 10:55:11 Exception in thread Thread[HintsDispatcher:2,1,main]
org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch exception
at org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:199) ~[apache-cassandra-3.0.14.jar:3.0.14]
at org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:164) ~[apache-cassandra-3.0.14.jar:3.0.14]
at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.0.14.jar:3.0.14]
at org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:157) ~[apache-cassandra-3.0.14.jar:3.0.14]
at org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:139) ~[apache-cassandra-3.0.14.jar:3.0.14]
at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:123) ~[apache-cassandra-3.0.14.jar:3.0.14]
at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:95) ~[apache-cassandra-3.0.14.jar:3.0.14]
at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:268) ~[apache-cassandra-3.0.14.jar:3.0.14]
at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:251) ~[apache-cassandra-3.0.14.jar:3.0.14]
at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:229) ~[apache-cassandra-3.0.14.jar:3.0.14]
at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:208) ~[apache-cassandra-3.0.14.jar:3.0.14]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_131]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_131]
at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) [apache-cassandra-3.0.14.jar:3.0.14]
at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]
Caused by: java.io.IOException: Digest mismatch exception
at org.apache.cassandra.hints.HintsReader$HintsIterator.computeNextInternal(HintsReader.java:216) ~[apache-cassandra-3.0.14.jar:3.0.14]
at org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:190) ~[apache-cassandra-3.0.14.jar:3.0.14]
... 16 common frames omitted
After some digging I found https://issues.apache.org/jira/browse/CASSANDRA-13696, and I think I need to delete the hint files so the nodes can reach a consistent state. However, Cassandra is running in DC/OS (Mesosphere) and I am not able to connect nodetool to truncate the hint files.
Is there any way I can delete the hint files? Or any other way to make the cluster consistent? Thanks in advance.
Your hint files are probably getting corrupted.
I would try keeping your data on mount points managed by a persistent storage driver such as REX-Ray, so the files survive container rescheduling.
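In the meantime, the hint files can be cleared from inside the container. A rough sketch, where the DC/OS task name and the data directory are assumptions — adjust both for your deployment:

```shell
# Assumptions: the task is named "cassandra-0" and data lives under
# /var/lib/cassandra. Both are placeholders for your DC/OS setup.

# Get a shell inside the Cassandra task.
dcos task exec --interactive --tty cassandra-0 /bin/bash

# Inside the container, try nodetool first; it talks to the local JMX port,
# so it may work even when remote nodetool access does not.
nodetool truncatehints

# If that fails, stop the Cassandra process, then delete the hint files directly.
rm -f /var/lib/cassandra/hints/*.hints

# After restarting the node, run a repair so the writes those hints represented
# are reconciled from the other replicas.
nodetool repair -pr
```

Deleting hints drops the queued writes they contain, so the repair step matters if consistency is a concern.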
On Apache Cassandra 3.11.1, during a decommission I noticed that the node tries to send hints. I checked the hints folder and found hints that were very old or belonged to already-removed nodes. I deleted them so they would not be part of the decommission process, but now the decommission fails and the node gets stuck in the UL (Up/Leaving) state.
Stacktrace:
ERROR [HintsDispatcher:3] 2022-12-12 17:58:39,364 CassandraDaemon.java:228 - Exception in thread Thread[HintsDispatcher:3,1,RMI Runtime]
java.lang.RuntimeException: java.nio.file.NoSuchFileException: /var/lib/cassandra/hints/xxxx-1670867104313-1.hints
at org.apache.cassandra.io.util.ChannelProxy.openChannel(ChannelProxy.java:55) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.io.util.ChannelProxy.<init>(ChannelProxy.java:66) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.hints.ChecksummedDataInput.open(ChecksummedDataInput.java:77) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.hints.HintsReader.open(HintsReader.java:78) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.hints.HintsDispatcher.create(HintsDispatcher.java:73) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:273) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:260) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:238) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:217) ~[apache-cassandra-3.11.1.jar:3.11.1]
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184) ~[na:1.8.0_131]
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[na:1.8.0_131]
at java.util.concurrent.ConcurrentHashMap$ValueSpliterator.forEachRemaining(ConcurrentHashMap.java:3566) ~[na:1.8.0_131]
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) ~[na:1.8.0_131]
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) ~[na:1.8.0_131]
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151) ~[na:1.8.0_131]
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174) ~[na:1.8.0_131]
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:1.8.0_131]
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418) ~[na:1.8.0_131]
at org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.transfer(HintsDispatchExecutor.java:186) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:159) ~[apache-cassandra-3.11.1.jar:3.11.1]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_131]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_131]
at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [apache-cassandra-3.11.1.jar:3.11.1]
at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]
Caused by: java.nio.file.NoSuchFileException: /var/lib/cassandra/hints/0891d19f-7ba9-4fc6-973c-79f98253cf4e-1670867104313-1.hints
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) ~[na:1.8.0_131]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ~[na:1.8.0_131]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ~[na:1.8.0_131]
at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177) ~[na:1.8.0_131]
at java.nio.channels.FileChannel.open(FileChannel.java:287) ~[na:1.8.0_131]
at java.nio.channels.FileChannel.open(FileChannel.java:335) ~[na:1.8.0_131]
at org.apache.cassandra.io.util.ChannelProxy.openChannel(ChannelProxy.java:51) ~[apache-cassandra-3.11.1.jar:3.11.1]
... 25 common frames omitted
Final Error:
ERROR [RMI TCP Connection(350)-127.0.0.1] 2022-12-12 18:05:33,190 StorageService.java:3954 - Error while decommissioning node
java.lang.RuntimeException: java.nio.file.NoSuchFileException: /var/lib/cassandra/hints/xxxx-1670867104313-1.hints
at org.apache.cassandra.io.util.ChannelProxy.openChannel(ChannelProxy.java:55) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.io.util.ChannelProxy.<init>(ChannelProxy.java:66) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.hints.ChecksummedDataInput.open(ChecksummedDataInput.java:77) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.hints.HintsReader.open(HintsReader.java:78) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.hints.HintsDispatcher.create(HintsDispatcher.java:73) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:273) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:260) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:238) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:217) ~[apache-cassandra-3.11.1.jar:3.11.1]
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184) ~[na:1.8.0_131]
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[na:1.8.0_131]
at java.util.concurrent.ConcurrentHashMap$ValueSpliterator.forEachRemaining(ConcurrentHashMap.java:3566) ~[na:1.8.0_131]
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) ~[na:1.8.0_131]
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) ~[na:1.8.0_131]
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151) ~[na:1.8.0_131]
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174) ~[na:1.8.0_131]
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:1.8.0_131]
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418) ~[na:1.8.0_131]
at org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.transfer(HintsDispatchExecutor.java:186) ~[apache-cassandra-3.11.1.jar:3.11.1]
at org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:159) ~[apache-cassandra-3.11.1.jar:3.11.1]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_131]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_131]
at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) ~[apache-cassandra-3.11.1.jar:3.11.1]
at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]
Caused by: java.nio.file.NoSuchFileException: /var/lib/cassandra/hints/0891d19f-7ba9-4fc6-973c-79f98253cf4e-1670867104313-1.hints
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) ~[na:1.8.0_131]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ~[na:1.8.0_131]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ~[na:1.8.0_131]
at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177) ~[na:1.8.0_131]
at java.nio.channels.FileChannel.open(FileChannel.java:287) ~[na:1.8.0_131]
at java.nio.channels.FileChannel.open(FileChannel.java:335) ~[na:1.8.0_131]
at org.apache.cassandra.io.util.ChannelProxy.openChannel(ChannelProxy.java:51) ~[apache-cassandra-3.11.1.jar:3.11.1]
... 25 common frames omitted
How can I avoid such a scenario?
I tried nodetool truncatehints, but it does nothing on the node.
At this point, I would just run nodetool removenode on it. The removenode process will get rid of a stuck DL/UL node and move data to its new token ranges from the other replicas.
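That could look like the following sketch; the host ID below is a placeholder (copy the real one from the nodetool status output), and the commands should run from a live node, not the stuck one:

```shell
# Find the stuck node's Host ID in the status output (the "Host ID" column).
nodetool status

# Remove the node by Host ID. The value here is a placeholder.
HOST_ID='00000000-0000-0000-0000-000000000000'
nodetool removenode "$HOST_ID"

# Check streaming progress. "force" skips any remaining streaming, so use it
# only as a last resort and follow up with a repair.
nodetool removenode status
nodetool removenode force
```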
What happens if all data was already migrated from the leaving node? Will removenode then stream it again and create duplicate data?
Yes, it streams the data again, but it doesn't create "duplicate data", at least not in the same way that in-place writes do.
So a cleanup will clear out this re-streamed data?
No. The streaming happens at the SSTable file level, so essentially a whole new file is streamed during the decommission/removenode process. If there was already duplicate, in-place-updated data present, it will still be there on its new node. If there was only a single write at the partition level, then that is all that will be there. If the file already exists, this process simply writes one file over another, and Cassandra will never know the old one was even there.
When you add or remove a node, a token range recalculation is triggered. This means some nodes end up holding data they are no longer responsible for (because another node now is). Running nodetool cleanup removes data which has already been streamed somewhere else and remains on a node that will never use it.
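In practice that means running cleanup on each node that stayed in the cluster, one node at a time, since it rewrites SSTables and is I/O-intensive. For example (the keyspace name is a placeholder):

```shell
# Reclaim space for all keyspaces on this node...
nodetool cleanup

# ...or limit it to a single keyspace (placeholder name).
nodetool cleanup my_keyspace
```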
I have a 15-node Cassandra 3.9 cluster. I recently faced an issue where one of my nodes was piling up GossipStage messages. Following guidance I found on a similar report, I ran 'nodetool resetlocalschema' on that node. Gossip warnings like this continue to show in the logs:
WARN [GossipTasks:1] 2018-02-11 23:55:34,197 Gossiper.java:771 - Gossip stage has 180317 pending tasks; skipping status check (no nodes will be marked down)
I also see the following exception. Any guidance on how I can overcome this and bring the node back to normal? I should also mention that I have PasswordAuthenticator enabled in cassandra.yaml.
ERROR [Native-Transport-Requests-1] 2018-02-11 23:55:33,581 Message.java:617 - Unexpected exception during request; channel = [id: 0xbaa65545, L:/10.1.21.51:9042 - R:/10.1.86.40:35082]
java.lang.RuntimeException: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: Unknown keyspace/cf pair (system_auth.roles)
at org.apache.cassandra.auth.PasswordAuthenticator.authenticate(PasswordAuthenticator.java:107) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.auth.PasswordAuthenticator.access$300(PasswordAuthenticator.java:59) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.auth.PasswordAuthenticator$PlainTextSaslAuthenticator.getAuthenticatedUser(PasswordAuthenticator.java:220) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:78) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513) [apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407) [apache-cassandra-3.9.jar:3.9]
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.39.Final.jar:4.0.39.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366) [netty-all-4.0.39.Final.jar:4.0.39.Final]
at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) [netty-all-4.0.39.Final.jar:4.0.39.Final]
at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:357) [netty-all-4.0.39.Final.jar:4.0.39.Final]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_91]
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) [apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.9.jar:3.9]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
Caused by: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: Unknown keyspace/cf pair (system_auth.roles)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2203) ~[guava-18.0.jar:na]
at com.google.common.cache.LocalCache.get(LocalCache.java:3937) ~[guava-18.0.jar:na]
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941) ~[guava-18.0.jar:na]
at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824) ~[guava-18.0.jar:na]
at org.apache.cassandra.auth.AuthCache.get(AuthCache.java:108) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.auth.PasswordAuthenticator.authenticate(PasswordAuthenticator.java:88) ~[apache-cassandra-3.9.jar:3.9]
... 13 common frames omitted
This issue was actually resolved by simply restarting the seed nodes of my cluster first, followed by the rest of the nodes. Thanks for all the inputs; truly appreciated.
When I add an external library to Zeppelin's Spark interpreter, sometimes, quite randomly, the Spark interpreter does not start. Other times it starts, but then I cannot access the added library. And sometimes it just works.
My best guess is that some JARs are conflicting.
This is the error output:
java.lang.NullPointerException
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:821)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:612)
at org.apache.zeppelin.scheduler.Job.run(Job.java:186)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
I'm trying to consume messages from RabbitMQ in Spark Streaming, following this answer: https://stackoverflow.com/a/38172737/1344854. It fails with an IllegalArgumentException and I don't know why.
6/09/05 13:23:22 ERROR ReceiverTracker: Deregistered receiver for stream 0: Error starting receiver 0 - java.lang.IllegalArgumentException: ssl://user:password#bunny.cloudamqp.com:8883/vhost
at org.eclipse.paho.client.mqttv3.MqttConnectOptions.validateURI(MqttConnectOptions.java:458)
at org.eclipse.paho.client.mqttv3.MqttAsyncClient.<init>(MqttAsyncClient.java:273)
at org.eclipse.paho.client.mqttv3.MqttAsyncClient.<init>(MqttAsyncClient.java:167)
at org.eclipse.paho.client.mqttv3.MqttClient.<init>(MqttClient.java:224)
at org.apache.spark.streaming.mqtt.MQTTReceiver.onStart(MQTTInputDStream.scala:71)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:148)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:130)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:575)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:565)
at org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:1992)
at org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:1992)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
I tried passing various forms of the brokerUrl argument to MQTTUtils.createStream, including tcp:
tcp://user:password#bunny.cloudamqp.com:1883/vhost
Why is it failing to validate, and how do I fix it?
I tried to implement what is explained here. It works when I keep the number of partitions in my custom partitioner equal to one, but when I change it to any other value it throws an ArrayIndexOutOfBoundsException.
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 6, deenbandhu): java.lang.ArrayIndexOutOfBoundsException: -2
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:151)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:920)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:918)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:918)
at DataSetCreation$.main(CreateDataSet.scala:100)
at DataSetCreation.main(CreateDataSet.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ArrayIndexOutOfBoundsException: -2
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:151)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I am unable to figure out what is causing this error.
Thanks in advance.
I found the issue: in my custom partitioner, the partition number computed from the hash code could be negative, which caused the ArrayIndexOutOfBoundsException. I now take the absolute value of the hash, and it works fine.
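For reference, the sign issue can be reproduced with plain integer arithmetic: like Java, shell arithmetic uses truncated division, so the remainder of % takes the sign of the dividend. Note that the absolute-value fix has an edge case in Java (Math.abs(Integer.MIN_VALUE) is still negative); the double-modulo form below avoids it.

```shell
hash=-123456789   # a hashCode() can be negative
n=8               # number of partitions

# Naive modulo keeps the dividend's sign, producing an invalid partition index.
echo $(( hash % n ))                  # -5

# Normalizing with a second modulo always lands in [0, n).
echo $(( ((hash % n) + n) % n ))      # 3
```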