I am running Spark to write data to HBase, but I get a NoSuchMethodError:
15/10/23 18:45:21 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1, dn18-formal.i.nease.net): java.lang.NoSuchMethodError: com.google.common.io.ByteStreams.limit(Ljava/io/InputStream;J)Ljava/io/InputStream;
I found a guava.jar in the hadoop/hbase directory and its version is 12.0, but com.google.common.io.ByteStreams.limit has only existed since 14.0, so the NoSuchMethodError occurs.
I tried running spark-submit with --jars, but the result is the same. I also tried adding
configuration.set("spark.executor.extraClassPath", "/home/ljh")
configuration.set("spark.driver.userClassPathFirst","true");
to my code, but the error is still the same.
How can I solve this? How can I remove the guava.jar in the hadoop/hbase directory from the classpath? Why doesn't Spark use the guava.jar in the Spark directory?
Here is my code:
rdd.foreach({ res =>
  val configuration = HBaseConfiguration.create();
  configuration.set("hbase.zookeeper.property.clientPort", "2181");
  configuration.set("hbase.zookeeper.quorum", "ip.66");
  configuration.set("hbase.master", "ip:60000");
  configuration.set("spark.executor.extraClassPath", "/home/ljh")
  configuration.set("spark.driver.userClassPathFirst","true");
  val hadmin = new HBaseAdmin(configuration);
  configuration.clear();
  configuration.addResource("/home/hadoop/conf/core-default.xml")
  configuration.addResource("/home/hadoop/conf/core-site.xml")
  configuration.addResource("/home/hadoop/conf/mapred-default.xml")
  configuration.addResource("/home/hadoop/conf/mapred-site.xml")
  configuration.addResource("/home/hadoop/conf/yarn-default.xml")
  configuration.addResource("/home/hadoop/conf/yarn-site.xml")
  configuration.addResource("/home/hadoop/conf/hdfs-default.xml")
  configuration.addResource("/home/hadoop/conf/hdfs-site.xml")
  configuration.addResource("/home/hadoop/conf/hbase-default.xml")
  configuration.addResource("/home/ljhn1829/hbase-site.xml")
  val table = new HTable(configuration, "ljh_test2");
  var put = new Put(Bytes.toBytes(res.toKey()));
  put.add(Bytes.toBytes("basic"), Bytes.toBytes("name"), Bytes.toBytes(res.totalCount + "\t" + res.positiveCount));
  table.put(put);
  table.flushCommits()
})
And this is the error message:
15/10/23 19:06:42 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1, gdc-dn126-formal.i.nease.net): java.lang.NoSuchMethodError: com.google.common.io.ByteStreams.limit(Ljava/io/InputStream;J)Ljava/io/InputStream;
at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.nextBatchStream(ExternalAppendOnlyMap.scala:420)
at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.<init>(ExternalAppendOnlyMap.scala:392)
at org.apache.spark.util.collection.ExternalAppendOnlyMap.spill(ExternalAppendOnlyMap.scala:207)
at org.apache.spark.util.collection.ExternalAppendOnlyMap.spill(ExternalAppendOnlyMap.scala:63)
at org.apache.spark.util.collection.Spillable$class.maybeSpill(Spillable.scala:83)
at org.apache.spark.util.collection.ExternalAppendOnlyMap.maybeSpill(ExternalAppendOnlyMap.scala:63)
at org.apache.spark.util.collection.ExternalAppendOnlyMap.insertAll(ExternalAppendOnlyMap.scala:129)
at org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:60)
at org.apache.spark.shuffle.hash.HashShuffleReader.read(HashShuffleReader.scala:46)
at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:90)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:70)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/10/23 19:06:42 INFO TaskSetManager: Starting task 0.1 in stage 1.0 (TID 2, gdc-dn166-formal.i.nease.net, PROCESS_LOCAL, 1277 bytes)
15/10/23 19:06:42 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on gdc-dn166-formal.i.nease.net:3838 (size: 3.2 KB, free: 1060.3 MB)
15/10/23 19:06:42 ERROR YarnScheduler: Lost executor 1 on gdc-dn126-formal.i.nease.net: remote Rpc client disassociated
15/10/23 19:06:42 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkExecutor@gdc-dn126-formal.i.nease.net:1656] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
15/10/23 19:06:42 INFO TaskSetManager: Re-queueing tasks for 1 from TaskSet 1.0
15/10/23 19:06:42 INFO DAGScheduler: Executor lost: 1 (epoch 1)
15/10/23 19:06:42 INFO BlockManagerMasterEndpoint: Trying to remove executor 1 from BlockManagerMaster.
15/10/23 19:06:42 INFO BlockManagerMasterEndpoint: Removing block manager BlockManagerId(1, gdc-dn126-formal.i.nease.net, 44635)
15/10/23 19:06:42 INFO BlockManagerMaster: Removed 1 successfully in removeExecutor
15/10/23 19:06:42 INFO ShuffleMapStage: ShuffleMapStage 0 is now unavailable on executor 1 (0/1, false)
15/10/23 19:06:42 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to gdc-dn166-formal.i.nease.net:28595
15/10/23 19:06:42 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 1 is 84 bytes
15/10/23 19:06:42 WARN TaskSetManager: Lost task 0.1 in stage 1.0 (TID 2, gdc-dn166-formal.i.nease.net): FetchFailed(null, shuffleId=1, mapId=-1, reduceId=0, message=
org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 1
at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$1.apply(MapOutputTracker.scala:389)
at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$1.apply(MapOutputTracker.scala:386)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.MapOutputTracker$.org$apache$spark$MapOutputTracker$$convertMapStatuses(MapOutputTracker.scala:385)
at org.apache.spark.MapOutputTracker.getServerStatuses(MapOutputTracker.scala:172)
at org.apache.spark.shuffle.hash.BlockStoreShuffleFetcher$.fetch(BlockStoreShuffleFetcher.scala:42)
at org.apache.spark.shuffle.hash.HashShuffleReader.read(HashShuffleReader.scala:40)
at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:90)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:70)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Add the following dependency:
<dependency>
  <groupId>com.google.guava</groupId>
  <artifactId>guava</artifactId>
  <version>14.0.1</version>
</dependency>
because, according to the source at https://guava.dev/releases/19.0/api/docs/src-html/com/google/common/io/ByteStreams.html#line.596, ByteStreams.limit has only been available since 14.0:
/**
 * Wraps a {@link InputStream}, limiting the number of bytes which can be
 * read.
 *
 * @param in the input stream to be wrapped
 * @param limit the maximum number of bytes to be read
 * @return a length-limited {@link InputStream}
 * @since 14.0 (since 1.0 as com.google.common.io.LimitInputStream)
 */
public static InputStream limit(InputStream in, long limit) {
  return new LimitedInputStream(in, limit);
}
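To see which jars on a node actually ship ByteStreams, and so which Guava copies may be competing on the classpath, a small script can scan a directory tree. This is only a minimal sketch and not part of the original answer: the function name find_jars_with_class is hypothetical, and the script reports which jars contain the class file, not which one the JVM loads first.

```python
import os
import zipfile


def find_jars_with_class(root_dir, class_name):
    """Return sorted paths of .jar files under root_dir that contain class_name.

    class_name is a fully qualified name such as
    "com.google.common.io.ByteStreams".
    """
    # A class a.b.C is stored in a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for filename in filenames:
            if not filename.endswith(".jar"):
                continue
            jar_path = os.path.join(dirpath, filename)
            try:
                with zipfile.ZipFile(jar_path) as jar:
                    if entry in jar.namelist():
                        matches.append(jar_path)
            except (zipfile.BadZipFile, OSError):
                # Skip unreadable or corrupt archives instead of aborting the scan.
                continue
    return sorted(matches)
```

Calling find_jars_with_class on the Hadoop, HBase and Spark install directories (whatever their paths are on the node) with "com.google.common.io.ByteStreams" lists every jar that bundles the class; the version in each jar's file name can then be compared against 14.0.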
We have 2.5 TB of data in HBase. The region size is 5 GB or 10 GB, and the HBase table has 450 regions. We need to convert this data for use with Spark SQL, and we use the following method:
1. Snapshot the HBase table.
2. Read the HFiles with newAPIHadoopRDD.
3. Write to Parquet.
val hconf = HBaseConfiguration.create()
hconf.set("hbase.rootdir", "/hbase")
hconf.set("hbase.zookeeper.quorum", HbaseToSparksqlBySnapshotParam.zookeeperQurum)
hconf.set(TableInputFormat.SCAN, convertScanToString(scan))
val job = Job.getInstance(hconf)
val path = new Path("/snapshot")
val snapshotName = HbaseToSparksqlBySnapshotParam.snapshotName
TableSnapshotInputFormat.setInput(job, snapshotName, path)
val hbaseRDD = spark.sparkContext.newAPIHadoopRDD(job.getConfiguration, classOf[TableSnapshotInputFormat], classOf[ImmutableBytesWritable], classOf[Result])
val rdd = hbaseRDD.map {
  case (_, result) =>
    ...
    Row
}
val df = spark.createDataFrame(rdd, schema)
df.write.parquet("/test")
num-executors   executor-memory (GB)   executor-cores   run time
16              6                      2                failed (error below)
5               15                     8                6 hours
I don't know how to set the parameters (num-executors, executor-memory, executor-cores) so that the job runs faster. When I read just one region (10 GB) with num-executors 1, executor-memory 3g and executor-cores 1, it runs in 14 minutes.
I am using Spark 2.1.0.
Error:
19/01/18 00:55:26 INFO TaskSetManager: Finished task 386.0 in stage 0.0 (TID 331) in 1187343 ms on hdh68 (executor 5) (331/470)
19/01/18 00:55:31 INFO TaskSetManager: Starting task 423.0 in stage 0.0 (TID 363, hdh68, executor 1, partition 423, NODE_LOCAL, 6905 bytes)
19/01/18 00:55:31 INFO TaskSetManager: Finished task 383.0 in stage 0.0 (TID 328) in 1427677 ms on hdh68 (executor 1) (332/470)
19/01/18 00:57:36 INFO YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 10.
19/01/18 00:57:36 INFO DAGScheduler: Executor lost: 10 (epoch 0)
19/01/18 00:57:36 INFO BlockManagerMasterEndpoint: Trying to remove executor 10 from BlockManagerMaster.
19/01/18 00:57:36 INFO BlockManagerMasterEndpoint: Removing block manager BlockManagerId(10, hdh68, 40679, None)
19/01/18 00:57:36 INFO BlockManagerMaster: Removed 10 successfully in removeExecutor
19/01/18 00:57:36 INFO DAGScheduler: Shuffle files lost for executor: 10 (epoch 0)
19/01/18 00:57:36 WARN DFSClient: Slow ReadProcessor read fields took 114044ms (threshold=30000ms); ack: seqno: 49 reply: 0 downstreamAckTimeNanos: 0, targets: [DatanodeInfoWithStorage[10.41.2.68:50010,DS-26a72d43-2e16-41a9-9a71-99593f14ab6f,DISK]]
19/01/18 00:57:39 WARN YarnSchedulerBackend$YarnSchedulerEndpoint: Container killed by YARN for exceeding memory limits. 6.6 GB of 6.6 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.
19/01/18 00:57:39 ERROR YarnScheduler: Lost executor 10 on hdh68: Container killed by YARN for exceeding memory limits. 6.6 GB of 6.6 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.
19/01/18 00:57:39 INFO BlockManagerMasterEndpoint: Trying to remove executor 10 from BlockManagerMaster.
19/01/18 00:57:39 INFO BlockManagerMaster: Removal of executor 10 requested
19/01/18 00:57:39 INFO YarnSchedulerBackend$YarnDriverEndpoint: Asked to remove non-existent executor 10
19/01/18 00:57:52 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(null) (10.41.2.68:49771) with ID 17
19/01/18 00:57:52 INFO TaskSetManager: Starting task 387.1 in stage 0.0 (TID 364, hdh68, executor 17, partition 387, NODE_LOCAL, 6906 bytes)
19/01/18 00:57:52 INFO TaskSetManager: Starting task 417.1 in stage 0.0 (TID 365, hdh68, executor 17, partition 417, NODE_LOCAL, 6906 bytes)
19/01/18 00:57:53 INFO BlockManagerMasterEndpoint: Registering block manager hdh68:38247 with 3.0 GB RAM, BlockManagerId(17, hdh68, 38247, None)
19/01/18 00:58:42 INFO TaskSetManager: Starting task 426.0 in stage 0.0 (TID 366, hdh68, executor 13, partition 426, NODE_LOCAL, 6905 bytes)
19/01/18 00:58:42 INFO TaskSetManager: Finished task 396.0 in stage 0.0 (TID 341) in 1014645 ms on hdh68 (executor 13) (333/470)
java.io.IOException: Broken pipe
at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
at sun.nio.ch.FileChannelImpl.transferToDirectlyInternal(FileChannelImpl.java:428)
at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:493)
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:608)
at io.netty.channel.DefaultFileRegion.transferTo(DefaultFileRegion.java:139)
at org.apache.spark.network.protocol.MessageWithHeader.transferTo(MessageWithHeader.java:121)
at io.netty.channel.socket.nio.NioSocketChannel.doWriteFileRegion(NioSocketChannel.java:287)
at io.netty.channel.nio.AbstractNioByteChannel.doWrite(AbstractNioByteChannel.java:237)
at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:314)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:802)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:313)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:770)
at io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1256)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:781)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:773)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:754)
at io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelOutboundHandlerAdapter.java:115)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:781)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:773)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:754)
at io.netty.channel.ChannelDuplexHandler.flush(ChannelDuplexHandler.java:117)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:781)
at io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:807)
at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:818)
at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:799)
at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:835)
at io.netty.channel.DefaultChannelPipeline.writeAndFlush(DefaultChannelPipeline.java:1017)
at io.netty.channel.AbstractChannel.writeAndFlush(AbstractChannel.java:256)
at org.apache.spark.network.server.TransportRequestHandler.respond(TransportRequestHandler.java:194)
at org.apache.spark.network.server.TransportRequestHandler.processStreamRequest(TransportRequestHandler.java:150)
at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:111)
at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:119)
at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:652)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:575)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:489)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:451)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
at java.lang.Thread.run(Thread.java:748)
19/01/18 00:59:58 INFO TaskSetManager: Starting task 428.0 in stage 0.0 (TID 367, hdh68, executor 17, partition 428, NODE_LOCAL, 6906 bytes)
19/01/18 00:59:58 WARN TaskSetManager: Lost task 387.1 in stage 0.0 (TID 364, hdh68, executor 17): java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.StreamInterceptor.channelInactive(StreamInterceptor.java:60)
at org.apache.spark.network.util.TransportFrameDecoder.channelInactive(TransportFrameDecoder.java:179)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:251)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:230)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1289)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:251)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)
at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:893)
at io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:691)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:408)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:455)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
at java.lang.Thread.run(Thread.java:748)
19/01/18 01:00:04 INFO TaskSetManager: Starting task 387.2 in stage 0.0 (TID 368, hdh68, executor 14, partition 387, NODE_LOCAL, 6906 bytes)
19/01/18 01:00:04 INFO TaskSetManager: Finished task 385.0 in stage 0.0 (TID 330) in 1471628 ms on hdh68 (executor 14) (334/470)
19/01/18 01:00:12 INFO TaskSetManager: Starting task 429.0 in stage 0.0 (TID 369, hdh68, executor 8, partition 429, NODE_LOCAL, 6905 bytes)
19/01/18 01:00:12 INFO TaskSetManager: Finished task 392.0 in stage 0.0 (TID 337) in 1220925 ms on hdh68 (executor 8) (335/470)
19/01/18 01:00:27 INFO TaskSetManager: Starting task 430.0 in stage 0.0 (TID 370, hdh68, executor 11, partition 430, NODE_LOCAL, 6906 bytes)
19/01/18 01:01:16 INFO TaskSetManager: Finished task 390.0 in stage 0.0 (TID 335) in 1353136 ms on hdh68 (executor 14) (337/470)
19/01/18 01:01:30 INFO TaskSetManager: Starting task 432.0 in stage 0.0 (TID 372, hdh68, executor 5, partition 432, NODE_LOCAL, 6906 bytes)
19/01/18 01:01:30 INFO TaskSetManager: Finished task 393.0 in stage 0.0 (TID 338) in 1244707 ms on hdh68 (executor 5) (338/470)
java.io.IOException: Broken pipe
at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
at sun.nio.ch.FileChannelImpl.transferToDirectlyInternal(FileChannelImpl.java:428)
at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:493)
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:608)
at io.netty.channel.DefaultFileRegion.transferTo(DefaultFileRegion.java:139)
at org.apache.spark.network.protocol.MessageWithHeader.transferTo(MessageWithHeader.java:121)
at io.netty.channel.socket.nio.NioSocketChannel.doWriteFileRegion(NioSocketChannel.java:287)
at io.netty.channel.nio.AbstractNioByteChannel.doWrite(AbstractNioByteChannel.java:237)
at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:314)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:802)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.forceFlush(AbstractNioChannel.java:319)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:646)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:575)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:489)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:451)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
at java.lang.Thread.run(Thread.java:748)
19/01/18 01:01:59 INFO TaskSetManager: Starting task 434.0 in stage 0.0 (TID 374, hdh68, executor 17, partition 434, NODE_LOCAL, 6905 bytes)
19/01/18 01:01:59 WARN TaskSetManager: Lost task 417.1 in stage 0.0 (TID 365, hdh68, executor 17): java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.StreamInterceptor.channelInactive(StreamInterceptor.java:60)
at org.apache.spark.network.util.TransportFrameDecoder.channelInactive(TransportFrameDecoder.java:179)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:251)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:230)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1289)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:251)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)
at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:893)
at io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:691)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:408)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:455)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
at java.lang.Thread.run(Thread.java:748)
Update:
The cluster is pseudo-distributed, with Memory Total 200g and VCores Total 64. These are the resources of the root.root queue; the queue has no other applications:
Used Resources: <memory:171776, vCores:44>
Num Active Applications: 7
Num Pending Applications: 0
Min Resources: <memory:0, vCores:0>
Max Resources: <memory:204800, vCores:64>
Steady Fair Share: <memory:54614, vCores:0>
Instantaneous Fair Share: <memory:102400, vCores:0>
Preemptable: true
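The 6.6 GB limit in the log is consistent with how Spark on YARN sizes an executor container: the requested executor memory plus a memory overhead. The following is a rough worked example, not a tuning recommendation. It assumes the Spark 2.x default overhead of max(384 MB, 10% of executor memory), since the post does not show spark.yarn.executor.memoryOverhead being set, and it ignores the driver container and any rounding the YARN scheduler applies. The function names are illustrative.

```python
OVERHEAD_PERCENT = 10   # assumed Spark 2.x default: 10% of executor memory
OVERHEAD_MIN_MB = 384   # assumed Spark 2.x default minimum overhead


def container_request_mb(executor_memory_mb, overhead_mb=None):
    """Memory requested from YARN for one executor: heap plus overhead."""
    if overhead_mb is None:
        overhead_mb = max(executor_memory_mb * OVERHEAD_PERCENT // 100,
                          OVERHEAD_MIN_MB)
    return executor_memory_mb + overhead_mb


def total_request_mb(num_executors, executor_memory_mb, overhead_mb=None):
    """Memory requested for all executors (the driver is not included)."""
    return num_executors * container_request_mb(executor_memory_mb, overhead_mb)


# The two configurations from the table: (num-executors, executor-memory in GB).
for num_executors, memory_gb in [(16, 6), (5, 15)]:
    per_container = container_request_mb(memory_gb * 1024)
    total = total_request_mb(num_executors, memory_gb * 1024)
    print("%d executors x %dg: %d MB per container (%.1f GB), %d MB in total"
          % (num_executors, memory_gb, per_container, per_container / 1024, total))
```

Under these assumptions a 6g executor gets a 6758 MB container, about 6.6 GB, which matches the limit YARN reports when it kills executor 10. Passing a larger overhead_mb models the effect of raising spark.yarn.executor.memoryOverhead, as the log message suggests.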
I am using pyspark and got the following messages:
17/12/03 11:57:48 WARN TaskSetManager: Lost task 0.0 in stage 5.0 (TID 1800, 172.31.27.9, executor 0): java.lang.AssertionError: assertion failed
at scala.Predef$.assert(Predef.scala:156)
at org.apache.spark.storage.BlockInfo.checkInvariants(BlockInfoManager.scala:84)
at org.apache.spark.storage.BlockInfo.readerCount_$eq(BlockInfoManager.scala:66)
at org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2$$anonfun$apply$2.apply(BlockInfoManager.scala:367)
at org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2$$anonfun$apply$2.apply(BlockInfoManager.scala:366)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2.apply(BlockInfoManager.scala:366)
at org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2.apply(BlockInfoManager.scala:361)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:361)
at org.apache.spark.storage.BlockManager.releaseAllLocksForTask(BlockManager.scala:736)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:342)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
17/12/03 11:57:48 INFO TaskSetManager: Starting task 0.1 in stage 5.0 (TID 1801, 172.31.27.9, executor 0, partition 0, PROCESS_LOCAL, 4871 bytes)
17/12/03 11:57:48 INFO TaskSetManager: Lost task 0.1 in stage 5.0 (TID 1801) on 172.31.27.9, executor 0: java.lang.AssertionError (assertion failed) [duplicate 1]
17/12/03 11:57:48 INFO TaskSetManager: Starting task 0.2 in stage 5.0 (TID 1802, 172.31.27.9, executor 0, partition 0, PROCESS_LOCAL, 4871 bytes)
17/12/03 11:57:48 INFO TaskSetManager: Lost task 0.2 in stage 5.0 (TID 1802) on 172.31.27.9, executor 0: java.lang.AssertionError (assertion failed) [duplicate 2]
17/12/03 11:57:48 INFO TaskSetManager: Starting task 0.3 in stage 5.0 (TID 1803, 172.31.27.9, executor 0, partition 0, PROCESS_LOCAL, 4871 bytes)
17/12/03 11:57:48 INFO TaskSetManager: Lost task 0.3 in stage 5.0 (TID 1803) on 172.31.27.9, executor 0: java.lang.AssertionError (assertion failed) [duplicate 3]
17/12/03 11:57:48 ERROR TaskSetManager: Task 0 in stage 5.0 failed 4 times; aborting job
17/12/03 11:57:48 INFO TaskSchedulerImpl: Removed TaskSet 5.0, whose tasks have all completed, from pool
17/12/03 11:57:48 INFO TaskSchedulerImpl: Cancelling stage 5
17/12/03 11:57:48 INFO DAGScheduler: ResultStage 5 (runJob at PythonRDD.scala:446) failed in 0.078 s due to Job aborted due to stage failure: Task 0 in stage 5.0 failed 4 times, most recent failure: Lost task 0.3 in stage 5.0 (TID 1803, 172.31.27.9, executor 0): java.lang.AssertionError: assertion failed
at scala.Predef$.assert(Predef.scala:156)
at org.apache.spark.storage.BlockInfo.checkInvariants(BlockInfoManager.scala:84)
at org.apache.spark.storage.BlockInfo.readerCount_$eq(BlockInfoManager.scala:66)
at org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2$$anonfun$apply$2.apply(BlockInfoManager.scala:367)
at org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2$$anonfun$apply$2.apply(BlockInfoManager.scala:366)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2.apply(BlockInfoManager.scala:366)
at org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2.apply(BlockInfoManager.scala:361)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:361)
at org.apache.spark.storage.BlockManager.releaseAllLocksForTask(BlockManager.scala:736)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:342)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Others seem to have run into "TaskSetManager: Task 0 in stage 5.0 failed 4 times; aborting job" as well, but I am particularly concerned about this message: ... Task 0 in stage 5.0 failed 4 times, most recent failure: Lost task 0.3 in stage 5.0 (TID 1803, 172.31.27.9, executor 0): java.lang.AssertionError: assertion failed.
I suspect the problem arises when the RDD is written out: there may be errors in the written data, and Spark then has trouble parsing it later.
I am using the ALS library and the Rating object, so before saving the RDD it is more convenient for me to map it to tuples:
data.map(lambda x: (x.user, x.product, x.rating)).saveAsTextFile ("hdfs://"+master_ip+":9000/RDD/data")
and then read it back and parse it as
data = sc.textFile("hdfs://"+master_ip+":9000/RDD/data")
data = data.map(lambda x: x[1:-1]).map(lambda x: x.split(", ")).\
map(lambda x: Rating(int(x[0]), int(x[1]), float(x[2])))
I am quite puzzled, since these error messages do not appear every time and are not always reproducible.
I am using Spark 2.2.0 and Hadoop 2.7. Has anyone seen this before?
Thanks!
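The save-and-parse round trip described above can be checked without a cluster. The sketch below is an illustration under stated assumptions, not a diagnosis: Rating is a namedtuple stand-in for pyspark.mllib.recommendation.Rating, to_line assumes that saveAsTextFile writes str() of each tuple, and both helper names are hypothetical. It only exercises the string parsing and does not reproduce the AssertionError, which is raised inside Spark's BlockInfoManager.

```python
from collections import namedtuple

# Stand-in for pyspark.mllib.recommendation.Rating so the sketch runs without Spark.
Rating = namedtuple("Rating", ["user", "product", "rating"])


def to_line(r):
    """Text assumed to be written for one (user, product, rating) tuple."""
    return str((r.user, r.product, r.rating))


def parse_line(line):
    """Parse a line such as "(1, 2, 3.5)" back into a Rating, as in the post."""
    fields = line[1:-1].split(", ")  # drop the parentheses, then split the fields
    if len(fields) != 3:
        # Fail loudly on malformed input instead of building a wrong Rating.
        raise ValueError("unexpected line: %r" % line)
    return Rating(int(fields[0]), int(fields[1]), float(fields[2]))


original = Rating(1, 2, 3.5)
restored = parse_line(to_line(original))
print(restored == original)
```

If a check like this passes on a sample of the saved lines, the text format itself is unlikely to be what triggers the intermittent failure.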
I am new to Cassandra and Spark, and I am trying to load data from a file into a Cassandra table using a Spark master cluster.
I am following the steps given in the link below:
http://docs.datastax.com/en/datastax_enterprise/4.7/datastax_enterprise/spark/sparkImportTxtCQL.html
In step 8 the data is shown as an integer array, but when I run the same command the result is shown as a string array: Array[Array[String]] = Array(Array(6, 7, 8))
After applying an explicit conversion, for example
scala> val arr = Array("1", "12", "123")
arr: Array[String] = Array(1, 12, 123)
scala> val intArr = arr.map(_.toInt)
intArr: Array[Int] = Array(1, 12, 123)
the result is shown in this format:
res24: org.apache.spark.rdd.RDD[Int] = MapPartitionsRDD[7] at map at <console>:33
Now, when I retrieve data from it with take or apply any other function to it, the following errors occur:
15/09/10 17:21:23 INFO SparkContext: Starting job: take at <console>:36
15/09/10 17:21:23 INFO DAGScheduler: Got job 23 (take at <console>:36) with 1 output partitions (allowLocal=true)
15/09/10 17:21:23 INFO DAGScheduler: Final stage: ResultStage 23(take at <console>:36)
15/09/10 17:21:23 INFO DAGScheduler: Parents of final stage: List()
15/09/10 17:21:23 INFO DAGScheduler: Missing parents: List()
15/09/10 17:21:23 INFO DAGScheduler: Submitting ResultStage 23 (MapPartitionsRDD[7] at map at <console>:33), which has no missing parents
15/09/10 17:21:23 INFO MemoryStore: ensureFreeSpace(3448) called with curMem=411425, maxMem=257918238
15/09/10 17:21:23 INFO MemoryStore: Block broadcast_25 stored as values in memory (estimated size 3.4 KB, free 245.6 MB)
15/09/10 17:21:23 INFO MemoryStore: ensureFreeSpace(2023) called with curMem=414873, maxMem=257918238
15/09/10 17:21:23 INFO MemoryStore: Block broadcast_25_piece0 stored as bytes in memory (estimated size 2023.0 B, free 245.6 MB)
15/09/10 17:21:23 INFO BlockManagerInfo: Added broadcast_25_piece0 in memory on 192.168.1.137:57524 (size: 2023.0 B, free: 245.9 MB)
15/09/10 17:21:23 INFO SparkContext: Created broadcast 25 from broadcast at DAGScheduler.scala:874
15/09/10 17:21:23 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 23 (MapPartitionsRDD[7] at map at <console>:33)
15/09/10 17:21:23 INFO TaskSchedulerImpl: Adding task set 23.0 with 1 tasks
15/09/10 17:21:23 INFO TaskSetManager: Starting task 0.0 in stage 23.0 (TID 117, 192.168.1.138, PROCESS_LOCAL, 1512 bytes)
15/09/10 17:21:23 INFO BlockManagerInfo: Added broadcast_25_piece0 in memory on 192.168.1.138:34977 (size: 2023.0 B, free: 265.4 MB)
15/09/10 17:21:23 WARN TaskSetManager: Lost task 0.0 in stage 23.0 (TID 117, 192.168.1.138): java.lang.ClassNotFoundException: $line67.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:66)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:58)
at org.apache.spark.scheduler.Task.run(Task.scala:70)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/09/10 17:21:23 INFO TaskSetManager: Starting task 0.1 in stage 23.0 (TID 118, 192.168.1.137, PROCESS_LOCAL, 1512 bytes)
15/09/10 17:21:23 INFO BlockManagerInfo: Added broadcast_25_piece0 in memory on 192.168.1.137:57296 (size: 2023.0 B, free: 265.4 MB)
15/09/10 17:21:23 INFO TaskSetManager: Lost task 0.1 in stage 23.0 (TID 118) on executor 192.168.1.137: java.lang.ClassNotFoundException ($line67.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1) [duplicate 1]
15/09/10 17:21:23 INFO TaskSetManager: Starting task 0.2 in stage 23.0 (TID 119, 192.168.1.137, PROCESS_LOCAL, 1512 bytes)
15/09/10 17:21:23 INFO TaskSetManager: Lost task 0.2 in stage 23.0 (TID 119) on executor 192.168.1.137: java.lang.ClassNotFoundException ($line67.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1) [duplicate 2]
15/09/10 17:21:23 INFO TaskSetManager: Starting task 0.3 in stage 23.0 (TID 120, 192.168.1.138, PROCESS_LOCAL, 1512 bytes)
15/09/10 17:21:23 INFO TaskSetManager: Lost task 0.3 in stage 23.0 (TID 120) on executor 192.168.1.138: java.lang.ClassNotFoundException ($line67.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1) [duplicate 3]
15/09/10 17:21:23 ERROR TaskSetManager: Task 0 in stage 23.0 failed 4 times; aborting job
15/09/10 17:21:23 INFO TaskSchedulerImpl: Removed TaskSet 23.0, whose tasks have all completed, from pool
15/09/10 17:21:23 INFO TaskSchedulerImpl: Cancelling stage 23
15/09/10 17:21:23 INFO DAGScheduler: ResultStage 23 (take at <console>:36) failed in 0.184 s
15/09/10 17:21:23 INFO DAGScheduler: Job 23 failed: take at <console>:36, took 0.194861 s
15/09/10 17:21:23 INFO BlockManagerInfo: Removed broadcast_24_piece0 on 192.168.1.137:57524 in memory (size: 1963.0 B, free: 245.9 MB)
15/09/10 17:21:23 INFO BlockManagerInfo: Removed broadcast_24_piece0 on 192.168.1.138:34977 in memory (size: 1963.0 B, free: 265.4 MB)
15/09/10 17:21:23 INFO BlockManagerInfo: Removed broadcast_24_piece0 on 192.168.1.137:57296 in memory (size: 1963.0 B, free: 265.4 MB)
15/09/10 17:21:23 INFO BlockManagerInfo: Removed broadcast_23_piece0 on 192.168.1.137:57524 in memory (size: 2.2 KB, free: 245.9 MB)
15/09/10 17:21:23 INFO BlockManagerInfo: Removed broadcast_23_piece0 on 192.168.1.138:34977 in memory (size: 2.2 KB, free: 265.4 MB)
15/09/10 17:21:23 INFO BlockManagerInfo: Removed broadcast_23_piece0 on 192.168.1.137:57296 in memory (size: 2.2 KB, free: 265.4 MB)
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 23.0 failed 4 times, most recent failure: Lost task 0.3 in stage 23.0 (TID 120, 192.168.1.138): java.lang.ClassNotFoundException: $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:66)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:58)
at org.apache.spark.scheduler.Task.run(Task.scala:70)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1273)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1264)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1263)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1263)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1457)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1418)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
Thanks in advance for help
It seems that you don't have the connector driver on your classpath.
Look at this part of the trace:
java.lang.ClassNotFoundException:
at java.lang.Class.forName(Class.java:274)
Please review your project and check whether the Cassandra Connector is among your dependencies.
I hope I've helped.
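One quick way to act on the advice above is to ask the JVM whether a given class can be loaded at all. The sketch below is plain Java and uses only Class.forName. The helper name isOnClasspath and the connector class name used in main are assumptions chosen for illustration, not something taken from the question. It only inspects the JVM it runs in (for example the driver), so it says nothing about what the executors are able to load.

```java
class ClasspathCheck {
    /** Returns true if the named class can be located by this class's class loader. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Either the class itself or something it links against is missing
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed class name, used here for illustration only
        String connectorClass = "com.datastax.spark.connector.cql.CassandraConnector";
        System.out.println(connectorClass + " on classpath: " + isOnClasspath(connectorClass));
    }
}
```

If the check prints false for a class you expect to be there, the jar that contains it is probably not being passed to the JVM you are testing.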
When I open the sparkR shell as shown below, I am able to run jobs successfully:
>bin/sparkR
>rdf = data.frame(name =c("a", "b"), age =c(1,2))
>df = createDataFrame(sqlContext, rdf)
>df
DataFrame[name:string, age:double]
Whereas when I include the spark-csv package while launching the sparkR shell, the job fails:
>bin/sparkR --packages com.databricks:spark-csv_2.10:1.0.3
>rdf = data.frame(name =c("a", "b"), age =c(1,2))
>df = createDataFrame(sqlContext, rdf)
15/06/25 17:59:50 INFO SparkContext: Starting job: collectPartitions at NativeMe
thodAccessorImpl.java:-2
15/06/25 17:59:50 INFO DAGScheduler: Got job 0 (collectPartitions at NativeMetho
dAccessorImpl.java:-2) with 1 output partitions (allowLocal=true)
15/06/25 17:59:50 INFO DAGScheduler: Final stage: ResultStage 0(collectPartition
s at NativeMethodAccessorImpl.java:-2)
15/06/25 17:59:50 INFO DAGScheduler: Parents of final stage: List()
15/06/25 17:59:50 INFO DAGScheduler: Missing parents: List()
15/06/25 17:59:50 INFO DAGScheduler: Submitting ResultStage 0 (ParallelCollectio
nRDD[0] at parallelize at RRDD.scala:453), which has no missing parents
15/06/25 17:59:50 WARN SizeEstimator: Failed to check whether UseCompressedOops
is set; assuming yes
15/06/25 17:59:50 INFO MemoryStore: ensureFreeSpace(1280) called with curMem=0,
maxMem=280248975
15/06/25 17:59:50 INFO MemoryStore: Block broadcast_0 stored as values in memory
(estimated size 1280.0 B, free 267.3 MB)
15/06/25 17:59:50 INFO MemoryStore: ensureFreeSpace(854) called with curMem=1280
, maxMem=280248975
15/06/25 17:59:50 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in
memory (estimated size 854.0 B, free 267.3 MB)
15/06/25 17:59:50 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on l
ocalhost:55886 (size: 854.0 B, free: 267.3 MB)
15/06/25 17:59:50 INFO SparkContext: Created broadcast 0 from broadcast at DAGSc
heduler.scala:874
15/06/25 17:59:50 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage
0 (ParallelCollectionRDD[0] at parallelize at RRDD.scala:453)
15/06/25 17:59:50 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
15/06/25 17:59:50 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, lo
calhost, PROCESS_LOCAL, 1632 bytes)
15/06/25 17:59:50 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
15/06/25 17:59:50 INFO Executor: Fetching http://172.16.104.224:55867/jars/org.a
pache.commons_commons-csv-1.1.jar with timestamp 1435235242519
15/06/25 17:59:50 INFO Utils: Fetching http://172.16.104.224:55867/jars/org.apac
he.commons_commons-csv-1.1.jar to C:\Users\edwinn\AppData\Local\Temp\spark-39ef1
9de-03f7-4b45-b91b-0828912c1789\userFiles-d9b0cd7f-d060-4acc-bd26-46ce34d975b3\f
etchFileTemp3674233359629683967.tmp
15/06/25 17:59:50 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.NullPointerException
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:
702)
at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
at org.apache.spark.util.Utils$.fetchFile(Utils.scala:465)
at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor
$Executor$$updateDependencies$5.apply(Executor.scala:398)
at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor
$Executor$$updateDependencies$5.apply(Executor.scala:390)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(
TraversableLike.scala:772)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.sca
la:98)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.sca
la:98)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala
:226)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.s
cala:771)
at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor
$$updateDependencies(Executor.scala:390)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:193)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.
java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor
.java:617)
at java.lang.Thread.run(Thread.java:745)
15/06/25 17:59:50 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.NullPointerException
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:
702)
at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
at org.apache.spark.util.Utils$.fetchFile(Utils.scala:465)
at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor
$Executor$$updateDependencies$5.apply(Executor.scala:398)
at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor
$Executor$$updateDependencies$5.apply(Executor.scala:390)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(
TraversableLike.scala:772)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.sca
la:98)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.sca
la:98)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala
:226)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.s
cala:771)
at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor
$$updateDependencies(Executor.scala:390)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:193)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.
java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor
.java:617)
at java.lang.Thread.run(Thread.java:745)
15/06/25 17:59:50 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have
all completed, from pool
15/06/25 17:59:50 INFO TaskSchedulerImpl: Cancelling stage 0
15/06/25 17:59:50 INFO DAGScheduler: ResultStage 0 (collectPartitions at NativeM
ethodAccessorImpl.java:-2) failed in 0.156 s
15/06/25 17:59:50 INFO DAGScheduler: Job 0 failed: collectPartitions at NativeMe
thodAccessorImpl.java:-2, took 0.301876 s
15/06/25 17:59:50 ERROR RBackendHandler: collectPartitions on 3 failed
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.
java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces
sorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandl
er.scala:127)
at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.s
cala:74)
at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.s
cala:36)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChanne
lInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(Abst
ractChannelHandlerContext.java:333)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(Abstra
ctChannelHandlerContext.java:319)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToM
essageDecoder.java:103)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(Abst
ractChannelHandlerContext.java:333)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(Abstra
ctChannelHandlerContext.java:319)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessage
Decoder.java:163)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(Abst
ractChannelHandlerContext.java:333)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(Abstra
ctChannelHandlerContext.java:319)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChanne
lPipeline.java:787)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(Abstra
ctNioByteChannel.java:130)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.jav
a:511)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEve
ntLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.ja
va:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThread
EventExecutor.java:116)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorato
r.run(DefaultThreadFactory.java:137)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Ta
sk 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.
0 (TID 0, localhost): java.lang.NullPointerException
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:
702)
at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
at org.apache.spark.util.Utils$.fetchFile(Utils.scala:465)
at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor
$Executor$$updateDependencies$5.apply(Executor.scala:398)
at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor
$Executor$$updateDependencies$5.apply(Executor.scala:390)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(
TraversableLike.scala:772)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.sca
la:98)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.sca
la:98)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala
:226)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.s
cala:771)
at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor
$$updateDependencies(Executor.scala:390)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:193)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.
java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor
.java:617)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DA
GScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1266)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(D
AGScheduler.scala:1257)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(D
AGScheduler.scala:1256)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.
scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala
:1256)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$
1.apply(DAGScheduler.scala:730)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$
1.apply(DAGScheduler.scala:730)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGSchedu
ler.scala:730)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAG
Scheduler.scala:1450)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAG
Scheduler.scala:1411)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
Error: returnStatus == 0 is not TRUE
>
I get the above error. Any suggestions? Thanks.
I am not using a cluster; I start the shell in local mode with:
>bin/SparkR --master local --packages com.databricks:spark-csv_2.10:1.0.3
My environment: Windows 8 Enterprise, Spark 1.4.1, Scala 2.10.1, spark-csv 2.11:1.0.3 / 2.10:1.0.3.
I have a strange problem.
I'm trying to train a multiclass SVM classifier like this:
JavaPairRDD<Tuple2<String, String>, SVMModel> jp = scmap.mapToPair(
    new PairFunction<Tuple2<Tuple2<String, String>, RDD<LabeledPoint>>, Tuple2<String, String>, SVMModel>() {
        @Override
        public Tuple2<Tuple2<String, String>, SVMModel> call(Tuple2<Tuple2<String, String>, RDD<LabeledPoint>> tup) {
            SVMWithSGD svmAlg = new SVMWithSGD();
            svmAlg.optimizer()
                .setNumIterations(100)
                .setRegParam(0.1)
                .setUpdater(new SquaredL2Updater());
            final SVMModel model = svmAlg.run(tup._2());
            model.clearThreshold();
            return new Tuple2<Tuple2<String, String>, SVMModel>(tup._1(), model);
        }
    });
But when I try to collect() jp, I get this error:
15/01/16 20:06:30 WARN scheduler.TaskSetManager: Lost task 7.0 in stage 5.0 (TID 147, fujitsu11.in.nu): java.lang.NullPointerException:
org.apache.spark.rdd.ParallelCollectionRDD$.slice(ParallelCollectionRDD.scala:157)
org.apache.spark.rdd.ParallelCollectionRDD.getPartitions(ParallelCollectionRDD.scala:97)
org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
scala.Option.getOrElse(Option.scala:120)
org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
org.apache.spark.rdd.RDD.take(RDD.scala:1060)
org.apache.spark.rdd.RDD.first(RDD.scala:1092)
org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm.run(GeneralizedLinearAlgorithm.scala:141)
maven.maven1.App$10.call(App.java:430)
maven.maven1.App$10.call(App.java:1)
org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:926)
org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:926)
scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
scala.collection.Iterator$class.foreach(Iterator.scala:727)
scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
scala.collection.AbstractIterator.to(Iterator.scala:1157)
scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
org.apache.spark.rdd.RDD$$anonfun$16.apply(RDD.scala:774)
org.apache.spark.rdd.RDD$$anonfun$16.apply(RDD.scala:774)
org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
org.apache.spark.scheduler.Task.run(Task.scala:54)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)
15/01/16 20:06:30 INFO scheduler.TaskSetManager: Starting task 7.1 in stage 5.0 (TID 148, fujitsu11.in.nu, PROCESS_LOCAL, 2219 bytes)
15/01/16 20:06:30 INFO scheduler.TaskSetManager: Lost task 7.1 in stage 5.0 (TID 148) on executor fujitsu11.in.nu: java.lang.NullPointerException (null) [duplicate 1]
15/01/16 20:06:30 INFO scheduler.TaskSetManager: Starting task 7.2 in stage 5.0 (TID 149, fujitsu11.in.nu, PROCESS_LOCAL, 2219 bytes)
15/01/16 20:06:30 INFO scheduler.TaskSetManager: Lost task 7.2 in stage 5.0 (TID 149) on executor fujitsu11.in.nu: java.lang.NullPointerException (null) [duplicate 2]
15/01/16 20:06:30 INFO scheduler.TaskSetManager: Starting task 7.3 in stage 5.0 (TID 150, fujitsu11.in.nu, PROCESS_LOCAL, 2219 bytes)
15/01/16 20:06:30 INFO scheduler.TaskSetManager: Lost task 7.3 in stage 5.0 (TID 150) on executor fujitsu11.in.nu: java.lang.NullPointerException (null) [duplicate 3]
15/01/16 20:06:30 ERROR scheduler.TaskSetManager: Task 7 in stage 5.0 failed 4 times; aborting job
15/01/16 20:06:30 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 5.0, whose tasks have all completed, from pool
15/01/16 20:06:30 INFO scheduler.TaskSchedulerImpl: Cancelling stage 5
15/01/16 20:06:30 INFO scheduler.DAGScheduler: Failed to run collectAsMap at App.java:452
Why do I get a NullPointerException here? I have checked several times that my
RDD<LabeledPoint>
and
Tuple2<String, String>
are not null. Is it perhaps not possible to train a classifier in parallel on the workers?
Thank You.
You cannot run distributed operations inside other distributed operations: here svmAlg.run, which itself works on an RDD, is called inside mapToPair, which executes on the workers. But your first mapToPair need not be a distributed operation. Just .par.map a local collection on the driver, so that each element spawns its own distributed operation to fit a model.
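The suggestion above refers to Scala parallel collections (.par.map). Since the question is written in Java, here is a rough Java analogue using parallelStream in place of .par. It is a sketch only: fitAll is a hypothetical helper, and the fit function stands in for a call such as svmAlg.run(rdd), which would launch a distributed job from the driver. To keep the example self-contained, main uses plain lists and a mean as placeholder data and placeholder "model"; none of that comes from the original post.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class DriverSideTraining {
    /**
     * Fits one model per key from driver-side code.
     * `fit` stands in for a call like svmAlg.run(rdd); it must not return null,
     * because ConcurrentHashMap rejects null values.
     */
    static <K, D, M> Map<K, M> fitAll(Map<K, D> datasets, Function<D, M> fit) {
        Map<K, M> models = new ConcurrentHashMap<>();
        // parallelStream uses local threads in this JVM, so every fit call is issued
        // from the driver rather than from inside a task running on a worker
        datasets.entrySet().parallelStream()
                .forEach(e -> models.put(e.getKey(), fit.apply(e.getValue())));
        return models;
    }

    public static void main(String[] args) {
        // Placeholder data: lists of labels instead of RDD<LabeledPoint>
        Map<String, List<Double>> datasets = new LinkedHashMap<>();
        datasets.put("a-vs-b", Arrays.asList(1.0, 0.0, 1.0));
        datasets.put("a-vs-c", Arrays.asList(0.0, 0.0));
        // Placeholder "training": the mean label of each dataset
        Map<String, Double> models = fitAll(datasets,
                d -> d.stream().mapToDouble(Double::doubleValue).average().orElse(0.0));
        System.out.println(models);
    }
}
```

In a real job the map passed to fitAll would be an ordinary driver-side collection from key to RDD, not an RDD whose elements are RDDs.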