I can't get Hive to run jobs with Spark - apache-spark

Running Hive with Spark keeps giving me the error below. I have tried many different versions of both Hive and Spark.
I am running Hadoop 2.7.3 in standalone mode; it is a DIY installation.
Right now I am using Spark 2.2 with Hive 2.3.5.
I am not sure what exactly the problem is or how to debug it:
0: jdbc:hive2://192.168.71.62:10000> select count(*) from traffic;
Getting log thread is interrupted, since query is done!
Error: org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create spark client.
at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:257)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client.
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:64)
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:115)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1526)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1232)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:255)
... 11 more
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel client '9be4c047-285d-4578-a934-7bd51294d240'. Error: Child process exited before connecting back with error log Warning: Ignoring non-spark config property: hive.spark.client.server.connect.timeout=90000
Warning: Ignoring non-spark config property: hive.spark.client.rpc.threads=8
Warning: Ignoring non-spark config property: hive.spark.client.connect.timeout=1000
Warning: Ignoring non-spark config property: hive.spark.client.secret.bits=256
Warning: Ignoring non-spark config property: hive.spark.client.rpc.max.size=52428800
19/05/20 12:39:42 WARN util.Utils: Your hostname, suypc183-OptiPlex-3020 resolves to a loopback address: 127.0.0.1; using 192.168.71.62 instead (on interface enp2s0)
19/05/20 12:39:42 WARN util.Utils: Set SPARK_LOCAL_IP if you need to bind to another address
19/05/20 12:39:43 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
19/05/20 12:39:43 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
19/05/20 12:39:43 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
19/05/20 12:39:43 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
19/05/20 12:39:43 INFO yarn.Client: Setting up container launch context for our AM
19/05/20 12:39:43 INFO yarn.Client: Setting up the launch environment for our AM container
19/05/20 12:39:43 INFO yarn.Client: Preparing resources for our AM container
19/05/20 12:39:44 INFO yarn.Client: Deleted staging directory hdfs://localhost:9000/user/anonymous/.sparkStaging/application_1558334426394_0004
Exception in thread "main" java.lang.IllegalArgumentException: Can not create a Path from an empty string
at org.apache.hadoop.fs.Path.checkPathArg(Path.java:126)
at org.apache.hadoop.fs.Path.<init>(Path.java:134)
at org.apache.hadoop.fs.Path.<init>(Path.java:93)
at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:369)
at org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:490)
at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:529)
at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:882)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:171)
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1167)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1226)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:744)
at org.apache.spark.deploy.SparkSubmit$$anon$1.run(SparkSubmit.scala:169)
at org.apache.spark.deploy.SparkSubmit$$anon$1.run(SparkSubmit.scala:167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:167)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
at com.google.common.base.Throwables.propagate(Throwables.java:160)
at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:125)
at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80)
at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:101)
at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:97)
at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:73)
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62)
... 22 more
My hive-site.xml configuration is the following:
<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/hive</value>
<description>Scratch space for Hive jobs</description>
</property>
<property>
<name>hive.execution.engine</name>
<value>spark</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>spark.master</name>
<value>yarn</value>
</property>
<property>
<name>spark.executor.memory</name>
<value>2048</value>
</property>
<property>
<name>spark.yarn.archive</name>
<value>hdfs://localhost:8088/user/jars/</value>
</property>
<property>
<name>spark.home</name>
<value>/home/danielphingston/spark</value>
</property>

The most common problem when connecting Hive and Spark is making sure that Spark knows where the Hadoop configuration directory is. We solve that by adding the statements below to the spark-env.sh file in Spark's conf directory:
HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-$SPARK_CONF_DIR/yarn-conf}
HIVE_CONF_DIR=${HIVE_CONF_DIR:-/etc/hive/conf}
if [ -d "$HIVE_CONF_DIR" ]; then
HADOOP_CONF_DIR="$HADOOP_CONF_DIR:$HIVE_CONF_DIR"
fi
export HADOOP_CONF_DIR
This lets Spark find the Hadoop (and Hive) configuration directories on the file system.
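The snippet above uses CDH-style default paths. For a hand-rolled installation like the one in the question, a minimal sketch of the same idea follows; the $HADOOP_HOME and $HIVE_HOME locations are assumptions, so adjust them to wherever your configs actually live:
# Assumed layout: point Spark at the Hadoop and Hive client configs.
HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-$HADOOP_HOME/etc/hadoop}
HIVE_CONF_DIR=${HIVE_CONF_DIR:-$HIVE_HOME/conf}
if [ -d "$HIVE_CONF_DIR" ]; then
HADOOP_CONF_DIR="$HADOOP_CONF_DIR:$HIVE_CONF_DIR"
fi
export HADOOP_CONF_DIR
After exporting the variables, restart HiveServer2 so the spark-submit child process it launches picks up the new environment, then re-run the query.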

Related

I want to run spark with yarn, but I get a java.net.ConnectException error

I want to run Spark on YARN:
root@server01:/export/server/spark# bin/spark-shell --master yarn
But it fails with an error like this:
root@server01:/export/server/spark# bin/spark-shell --master yarn
22/12/06 07:51:42 WARN Utils: Your hostname, server01 resolves to a loopback address: 127.0.1.1; using 192.168.40.133 instead (on interface ens33)
22/12/06 07:51:42 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
22/12/06 07:51:51 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/12/06 07:57:58 ERROR YarnClientSchedulerBackend: The YARN application has already ended! It might have been killed or the Application Master may have failed to start. Check the YARN application logs for more details.
22/12/06 07:57:58 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Application application_1670312235175_0001 failed 2 times due to Error launching appattempt_1670312235175_0001_000002. Got exception: java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:41647 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
spark-defaults.conf
spark.eventLog.enabled true
spark.eventLog.dir hdfs://node1:8020/sparklog/
spark.eventLog.compress true
# spark-yarn jar package
spark.yarn.jars hdfs://node1:8020/spark/jars/*
# spark and yarn history server
spark.yarn.historyServer.address node1:18080
yarn-site.xml
<property>
<name>yarn.log.server.url</name>
<value>http://node1:19888/jobhistory/logs</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
I set spark.yarn.historyServer.address to node1:19888 and restarted Spark, but it doesn't work.

spark-submit on local Hadoop-Yarn setup, fails with Stdout path must be absolute error

I have installed the latest Hadoop and Spark versions on my Windows machine.
I am trying to launch one of the provided examples but it fails, and I have no idea what the diagnostic means. It seems to be related to stdout, but I can't figure out the root cause.
I launch the following command:
spark-submit --master yarn --class org.apache.spark.examples.JavaSparkPi C:\spark-3.0.1-bin-hadoop3.2\examples\jars\spark-examples_2.12-3.0.1.jar 100
And the exception I have is:
21/01/25 10:53:53 WARN MetricsSystem: Stopping a MetricsSystem that is not running
21/01/25 10:53:53 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
21/01/25 10:53:53 INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: Application application_1611568137841_0002 failed 2 times due to AM Container for appattempt_1611568137841_0002_000002 exited with exitCode: -1
Failing this attempt.Diagnostics:
[2021-01-25 10:53:53.381] Stdout path must be absolute
For more detailed output, check the application tracking page: http://xxxx-PC:8088/cluster/app/application_1611568137841_0002 Then click on links to logs of each attempt.
. Failing the application.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:95)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:201)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:555)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2574)
at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:934)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:928)
at org.apache.spark.examples.JavaSparkPi.main(JavaSparkPi.java:37)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:928)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
21/01/25 10:53:53 INFO ShutdownHookManager: Shutdown hook called
21/01/25 10:53:53 INFO ShutdownHookManager: Deleting directory C:\Users\xxx\AppData\Local\Temp\spark-b28ecb32-5e3f-4d6a-973a-c03a7aae0da9
21/01/25 10:53:53 INFO ShutdownHookManager: Deleting directory C:\Users/xxx\AppData\Local\Temp\spark-3665ba77-d2aa-424a-9f75-e772bb5b9104
As for the diagnostics:
Diagnostics:
Application application_1611562870926_0004 failed 2 times due to AM Container for appattempt_1611562870926_0004_000002 exited with exitCode: -1
Failing this attempt.Diagnostics: [2021-01-25 10:29:19.734]Stdout path must be absolute
For more detailed output, check the application tracking page: http://****-PC:8088/cluster/app/application_1611562870926_0004 Then click on links to logs of each attempt.
. Failing the application.
Thank you!
I am not sure of the root cause yet; it is probably because I run under Windows and some default property was wrong for YARN.
When I added the two following properties to yarn-site.xml, it worked fine:
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>/tmp</value>
</property>
<property>
<name>yarn.log.dir</name>
<value>/tmp</value>
</property>
Hope it helps someone in the future!
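As a usage note (a sketch, assuming a stock Hadoop-on-Windows install with the scripts under %HADOOP_HOME%\sbin): YARN has to be restarted so the NodeManager picks up the new yarn-site.xml, and then the example from the question can be re-run:
%HADOOP_HOME%\sbin\stop-yarn.cmd
%HADOOP_HOME%\sbin\start-yarn.cmd
spark-submit --master yarn --class org.apache.spark.examples.JavaSparkPi C:\spark-3.0.1-bin-hadoop3.2\examples\jars\spark-examples_2.12-3.0.1.jar 100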

Failed to send RPC XXXX in spark-shell Hadoop 3.2.1 and spark 3.0.0

I am trying to run spark-shell in pseudo-distributed mode on my Windows 10 PC, which has 8 GB of RAM.
I am able to submit and run a MapReduce word count on YARN, but when I try to initialize a Spark shell or spark-submit any program with master set to yarn, it fails with a "failed to send RPC" error.
The error is given below.
Below is my yarn-site.xml config
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.auxservices.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>C:\study\hadoop-3.2.1\data\nodemanager</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>127.0.0.1</value>
</property>
<property>
<name>yarn.acl.enable</name>
<value>0</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PERPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>false</value>
</property>
<!-- Site specific YARN configuration properties -->
</configuration>
From my initial investigation, this seems to be caused by the Netty IO library calling the AbstractFileRegion.transferred() method in Spark's network utils, which does not seem to be present...
Below is the complete error.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/study/hadoop-3.2.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/study/hadoop-3.2.1/data/nodemanager/usercache/Administrator/appcache/application_1609008428682_0006/container_1609008428682_0006_01_000001/__spark_libs__/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-12-27 01:27:52,370 WARN util.Shell: Did not find winutils.exe: {}
java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see https://wiki.apache.org/hadoop/WindowsProblems
at org.apache.hadoop.util.Shell.fileNotFoundException(Shell.java:548)
at org.apache.hadoop.util.Shell.getHadoopHomeDir(Shell.java:569)
at org.apache.hadoop.util.Shell.getQualifiedBin(Shell.java:592)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:689)
at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:78)
at org.apache.hadoop.yarn.conf.YarnConfiguration.<clinit>(YarnConfiguration.java:1159)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:858)
at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:921)
at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
Caused by: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset.
at org.apache.hadoop.util.Shell.checkHadoopHomeInner(Shell.java:468)
at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:439)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:516)
... 5 more
2020-12-27 01:27:52,776 INFO spark.SecurityManager: Changing view acls to: Administrator
2020-12-27 01:27:52,777 INFO spark.SecurityManager: Changing modify acls to: Administrator
2020-12-27 01:27:52,778 INFO spark.SecurityManager: Changing view acls groups to:
2020-12-27 01:27:52,779 INFO spark.SecurityManager: Changing modify acls groups to:
2020-12-27 01:27:52,780 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(Administrator); groups with view permissions: Set(); users with modify permissions: Set(Administrator); groups with modify permissions: Set()
2020-12-27 01:27:53,417 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1609008428682_0006_000001
2020-12-27 01:27:54,627 INFO client.RMProxy: Connecting to ResourceManager at /127.0.0.1:8030
2020-12-27 01:27:54,727 INFO yarn.YarnRMClient: Registering the ApplicationMaster
2020-12-27 01:27:55,305 INFO client.TransportClientFactory: Successfully created connection to LAPTOP-GQ2OL7O9/192.168.0.106:56588 after 137 ms (0 ms spent in bootstraps)
2020-12-27 01:27:55,341 ERROR client.TransportClient: Failed to send RPC RPC 6402554451456766428 to LAPTOP-GQ2OL7O9/192.168.0.106:56588: io.netty.channel.socket.ChannelOutputShutdownException: Channel output shutdown
io.netty.channel.socket.ChannelOutputShutdownException: Channel output shutdown
at io.netty.channel.AbstractChannel$AbstractUnsafe.shutdownOutput(AbstractChannel.java:587)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:893)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:313)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:847)
at io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1264)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:743)
at io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelOutboundHandlerAdapter.java:115)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:743)
at io.netty.channel.ChannelDuplexHandler.flush(ChannelDuplexHandler.java:117)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.access$1500(AbstractChannelHandlerContext.java:35)
at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1116)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1050)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:464)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoSuchMethodError: org.apache.spark.network.util.AbstractFileRegion.transferred()J
at org.apache.spark.network.util.AbstractFileRegion.transfered(AbstractFileRegion.java:28)
at io.netty.channel.nio.AbstractNioByteChannel.doWrite(AbstractNioByteChannel.java:228)
at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:282)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:879)
... 21 more
2020-12-27 01:27:55,353 ERROR yarn.ApplicationMaster: Uncaught exception:
org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:302)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:109)
at org.apache.spark.deploy.yarn.ApplicationMaster.runExecutorLauncher(ApplicationMaster.scala:547)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:266)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:890)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:889)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:889)
at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:921)
at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
Caused by: java.io.IOException: Failed to send RPC RPC 6402554451456766428 to LAPTOP-GQ2OL7O9/192.168.0.106:56588: io.netty.channel.socket.ChannelOutputShutdownException: Channel output shutdown
at org.apache.spark.network.client.TransportClient$RpcChannelListener.handleFailure(TransportClient.java:363)
at org.apache.spark.network.client.TransportClient$StdChannelListener.operationComplete(TransportClient.java:340)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:500)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:479)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122)
at io.netty.util.internal.PromiseNotificationUtil.tryFailure(PromiseNotificationUtil.java:64)
at io.netty.channel.ChannelOutboundBuffer.safeFail(ChannelOutboundBuffer.java:680)
at io.netty.channel.ChannelOutboundBuffer.remove0(ChannelOutboundBuffer.java:294)
at io.netty.channel.ChannelOutboundBuffer.failFlushed(ChannelOutboundBuffer.java:617)
at io.netty.channel.AbstractChannel$AbstractUnsafe.closeOutboundBufferForShutdown(AbstractChannel.java:627)
at io.netty.channel.AbstractChannel$AbstractUnsafe.shutdownOutput(AbstractChannel.java:620)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:893)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:313)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:847)
at io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1264)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:743)
at io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelOutboundHandlerAdapter.java:115)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:743)
at io.netty.channel.ChannelDuplexHandler.flush(ChannelDuplexHandler.java:117)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.access$1500(AbstractChannelHandlerContext.java:35)
at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1116)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1050)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:464)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
Caused by: io.netty.channel.socket.ChannelOutputShutdownException: Channel output shutdown
at io.netty.channel.AbstractChannel$AbstractUnsafe.shutdownOutput(AbstractChannel.java:587)
... 22 more
Caused by: java.lang.NoSuchMethodError: org.apache.spark.network.util.AbstractFileRegion.transferred()J
at org.apache.spark.network.util.AbstractFileRegion.transfered(AbstractFileRegion.java:28)
at io.netty.channel.nio.AbstractNioByteChannel.doWrite(AbstractNioByteChannel.java:228)
at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:282)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:879)
... 21 more
2020-12-27 01:27:55,357 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:302)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:109)
at org.apache.spark.deploy.yarn.ApplicationMaster.runExecutorLauncher(ApplicationMaster.scala:547)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:266)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:890)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:889)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:889)
at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:921)
at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
Caused by: java.io.IOException: Failed to send RPC RPC 6402554451456766428 to LAPTOP-GQ2OL7O9/192.168.0.106:56588: io.netty.channel.socket.ChannelOutputShutdownException: Channel output shutdown
at org.apache.spark.network.client.TransportClient$RpcChannelListener.handleFailure(TransportClient.java:363)
at org.apache.spark.network.client.TransportClient$StdChannelListener.operationComplete(TransportClient.java:340)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:500)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:479)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122)
at io.netty.util.internal.PromiseNotificationUtil.tryFailure(PromiseNotificationUtil.java:64)
at io.netty.channel.ChannelOutboundBuffer.safeFail(ChannelOutboundBuffer.java:680)
at io.netty.channel.ChannelOutboundBuffer.remove0(ChannelOutboundBuffer.java:294)
at io.netty.channel.ChannelOutboundBuffer.failFlushed(ChannelOutboundBuffer.java:617)
at io.netty.channel.AbstractChannel$AbstractUnsafe.closeOutboundBufferForShutdown(AbstractChannel.java:627)
at io.netty.channel.AbstractChannel$AbstractUnsafe.shutdownOutput(AbstractChannel.java:620)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:893)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:313)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:847)
at io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1264)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:743)
at io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelOutboundHandlerAdapter.java:115)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:743)
at io.netty.channel.ChannelDuplexHandler.flush(ChannelDuplexHandler.java:117)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:770)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:762)
at io.netty.channel.AbstractChannelHandlerContext.access$1500(AbstractChannelHandlerContext.java:35)
at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1116)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1050)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:464)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
Caused by: io.netty.channel.socket.ChannelOutputShutdownException: Channel output shutdown
at io.netty.channel.AbstractChannel$AbstractUnsafe.shutdownOutput(AbstractChannel.java:587)
... 22 more
Caused by: java.lang.NoSuchMethodError: org.apache.spark.network.util.AbstractFileRegion.transferred()J
at org.apache.spark.network.util.AbstractFileRegion.transfered(AbstractFileRegion.java:28)
at io.netty.channel.nio.AbstractNioByteChannel.doWrite(AbstractNioByteChannel.java:228)
at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:282)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:879)
... 21 more
)
2020-12-27 01:27:55,368 INFO util.ShutdownHookManager: Shutdown hook called
There seems to be no help on the internet for my case...
Thanks in advance.
Caused by: java.lang.NoSuchMethodError: org.apache.spark.network.util.AbstractFileRegion.transferred()J
at org.apache.spark.network.util.AbstractFileRegion.transfered(AbstractFileRegion.java:28)
at io.netty.channel.nio.AbstractNioByteChannel.doWrite(AbstractNioByteChannel.java:228)
at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:282)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:879)
... 21 more
This looks like you may have multiple versions of the same library on the classpath. Ensure you only have one version on the classpath (and that it is the right one).
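One way to check for that, as a sketch (assuming a Unix-like shell and the usual SPARK_HOME/HADOOP_HOME layout; on Windows the same search can be done from Explorer or PowerShell), is to look for more than one copy of Spark's network classes reachable by the YARN containers:
# Spark's network classes must come from exactly one version.
# Jars shipped with the Spark distribution:
find "$SPARK_HOME/jars" -name 'spark-network-common*.jar'
# Jars the Hadoop/YARN tree may add to the container classpath:
find "$HADOOP_HOME/share/hadoop" -name 'spark-*.jar'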

Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher

I have submitted this job in YARN mode but I get the following error.
I am using a Spark jar installed locally (the Spark jar can also be in a world-readable location on HDFS, which allows YARN to cache it on the nodes), and I have added YARN_CONF_DIR and HADOOP_CONF_DIR to .bashrc.
ERROR:
Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher
9068548 [bioingine-management-service-akka.actor.default-dispatcher-14] INFO org.apache.spark.deploy.yarn.Client - Application report for application_1531990849146_0010 (state: FAILED)
9068548 [bioingine-management-service-akka.actor.default-dispatcher-14] INFO org.apache.spark.deploy.yarn.Client -
client token: N/A
diagnostics: Application application_1531990849146_0010 failed 2 times due to AM Container for appattempt_1531990849146_0010_000002 exited with exitCode: 1
Failing this attempt.Diagnostics: [2018-07-19 11:56:58.484]Exception from container-launch.
Container id: container_1531990849146_0010_02_000001
Exit code: 1
[2018-07-19 11:56:58.484]
[2018-07-19 11:56:58.486]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher
[2018-07-19 11:56:58.486]
[2018-07-19 11:56:58.486]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher
[2018-07-19 11:56:58.486]
For more detailed output, check the application tracking page: http://localhost:8088/cluster/app/application_1531990849146_0010 Then click on links to logs of each attempt.
. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1532001413903
final status: FAILED
tracking URL: http://localhost:8088/cluster/app/application_1531990849146_0010
user: root
9068611 [bioingine-management-service-akka.actor.default-dispatcher-14] INFO org.apache.spark.deploy.yarn.Client - Deleted staging directory file:/root/.sparkStaging/application_1531990849146_0010
9068612 [bioingine-management-service-akka.actor.default-dispatcher-14] ERROR org.apache.spark.SparkContext - Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:85)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:173)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:909)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:901)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:901)
at com.bioingine.smash.management.services.SmashExtractorService.getSparkSession(SmashExtractorService.scala:79)
at com.bioingine.smash.management.services.SmashExtractorService.getFileHeaders(SmashExtractorService.scala:83)
at com.bioingine.smash.management.services.SmashService$$anonfun$getcolumnHeaders$1.apply(SmashService.scala:90)
at com.bioingine.smash.management.services.SmashService$$anonfun$getcolumnHeaders$1.apply(SmashService.scala:90)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
9068618 [bioingine-management-service-akka.actor.default-dispatcher-14] INFO o.s.jetty.server.AbstractConnector - Stopped Spark#2152c728{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
9068619 [bioingine-management-service-akka.actor.default-dispatcher-14] INFO org.apache.spark.ui.SparkUI - Stopped Spark web UI at http://localhost:4040
9068620 [dispatcher-event-loop-16] WARN o.a.s.s.c.YarnSchedulerBackend$YarnSchedulerEndpoint - Attempted to request executors before the AM has registered!
9068621 [bioingine-management-service-akka.actor.default-dispatcher-14] INFO o.a.s.s.c.YarnClientSchedulerBackend - Shutting down all executors
9068621 [dispatcher-event-loop-17] INFO o.a.s.s.c.YarnSchedulerBackend$YarnDriverEndpoint - Asking each executor to shut down
9068622 [bioingine-management-service-akka.actor.default-dispatcher-14] INFO o.a.s.s.c.SchedulerExtensionServices - Stopping SchedulerExtensionServices
(serviceOption=None,
services=List(),
started=false)
9068622 [bioingine-management-service-akka.actor.default-dispatcher-14] INFO o.a.s.s.c.YarnClientSchedulerBackend - Stopped
9068623 [dispatcher-event-loop-20] INFO o.a.s.MapOutputTrackerMasterEndpoint - MapOutputTrackerMasterEndpoint stopped!
9068624 [bioingine-management-service-akka.actor.default-dispatcher-14] ERROR org.apache.spark.util.Utils - Uncaught exception in thread bioingine-management-service-akka.actor.default-dispatcher-14
java.lang.NullPointerException: null
at org.apache.spark.network.shuffle.ExternalShuffleClient.close(ExternalShuffleClient.java:141)
at org.apache.spark.storage.BlockManager.stop(BlockManager.scala:1485)
at org.apache.spark.SparkEnv.stop(SparkEnv.scala:90)
at org.apache.spark.SparkContext$$anonfun$stop$11.apply$mcV$sp(SparkContext.scala:1937)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1317)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1936)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:587)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:909)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:901)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:901)
at com.bioingine.smash.management.services.SmashExtractorService.getSparkSession(SmashExtractorService.scala:79)
at com.bioingine.smash.management.services.SmashExtractorService.getFileHeaders(SmashExtractorService.scala:83)
at com.bioingine.smash.management.services.SmashService$$anonfun$getcolumnHeaders$1.apply(SmashService.scala:90)
at com.bioingine.smash.management.services.SmashService$$anonfun$getcolumnHeaders$1.apply(SmashService.scala:90)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
9068624 [bioingine-management-service-akka.actor.default-dispatcher-14] INFO org.apache.spark.SparkContext - Successfully stopped SparkContext
9068627 [bioingine-management-service-akka.actor.default-dispatcher-15] ERROR akka.actor.ActorSystemImpl - Error during processing of request: 'Yarn application has already ended! It might have been killed or unable to launch application master.'. Completing with 500 Internal Server Error response. To change default exception handling behavior, provide a custom ExceptionHandler.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:85)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:173)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:909)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:901)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:901)
at com.bioingine.smash.management.services.SmashExtractorService.getSparkSession(SmashExtractorService.scala:79)
at com.bioingine.smash.management.services.SmashExtractorService.getFileHeaders(SmashExtractorService.scala:83)
at com.bioingine.smash.management.services.SmashService$$anonfun$getcolumnHeaders$1.apply(SmashService.scala:90)
at com.bioingine.smash.management.services.SmashService$$anonfun$getcolumnHeaders$1.apply(SmashService.scala:90)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
**SparkSession configuration**
new SparkConf().setMaster("yarn").setAppName("Test").set("spark.executor.memory", "3g")
.set("spark.ui.enabled","true")
.set("spark.driver.memory","9g")
.set("spark.default.parallelism","10")
.set("spark.executor.cores","3")
.set("spark.cores.max","9")
.set("spark.memory.offHeap.enabled","true")
.set("spark.memory.offHeap.size","6g")
.set("spark.yarn.am.memory","2g")
.set("spark.yarn.am.cores","2")
.set("spark.yarn.am.cores","2")
.set("spark.yarn.archive","hdfs://localhost:9000/user/spark/share/lib/spark2-hdp-yarn-archive.tar.gz")
.set("spark.yarn.jars","hdfs://localhost:9000/user/spark/share/lib/spark-yarn_2.11.2.2.0.jar")
**We added the configuration below**
1. These entries are in $SPARK_HOME/conf/spark-defaults.conf
spark.driver.extraJavaOptions -Dhdp.version=2.9.0
spark.yarn.am.extraJavaOptions -Dhdp.version=2.9.0
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
2. yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle,spark_shuffle</value>
<description>shuffle service that needs to be set for Map Reduce to run</description>
</property>
<property>
<name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
<value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
<property>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.application.classpath</name>
<value>/usr/share/hadoop/etc/hadoop,/usr/share/hadoop/,/usr/share/hadoop/lib/,/usr/share/hadoop/share/hadoop/common/,/usr/share/hadoop/share/hadoop/common/lib, /usr/share/hadoop/share/hadoop/hdfs/,/usr/share/hadoop/share/hadoop/hdfs/lib/,/usr/share/hadoop/share/hadoop/mapreduce/,/usr/share/hadoop/share/hadoop/mapreduce/lib/,/usr/share/hadoop/share/hadoop/tools/lib/,/usr/share/hadoop/share/hadoop/yarn/,/usr/share/hadoop/share/hadoop/yarn/lib/*,/usr/share/spark/jars/spark-yarn_2.11-2.2.0.jar</value>
</property>
</configuration>
3. spark-env.sh
export HADOOP_CONF_DIR=/home/hadoop/hadoop/etc/hadoop
export SPARK_HOME=/home/hadoop/spark
SPARK_DIST_CLASSPATH="/usr/share/spark/jars/*"
4. .bashrc
export JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64/"
export SBT_OPTS="-Xms16G -Xmx16G"
export HADOOP_INSTALL=/usr/share/hadoop
export HADOOP_CONF_DIR=/usr/share/hadoop/etc/hadoop/
export YARN_CONF_DIR=/usr/share/hadoop/etc/hadoop/
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export SPARK_CLASSPATH="/usr/share/spark/jars/*"
export SPARK_HOME="/usr/share/spark/"
export PATH=$PATH:$SPARK_HOME

Property spark.yarn.jars - how to deal with it?

My knowledge of Spark is limited, and you will sense it after reading this question. I have just one node, and Spark, Hadoop and YARN are installed on it.
I was able to code and run a word-count problem in cluster mode with the below command:
spark-submit --class com.sanjeevd.sparksimple.wordcount.JobRunner
--master yarn
--deploy-mode cluster
--driver-memory=2g
--executor-memory 2g
--executor-cores 1
--num-executors 1
SparkSimple-0.0.1-SNAPSHOT.jar
hdfs://sanjeevd.br:9000/user/spark-test/word-count/input
hdfs://sanjeevd.br:9000/user/spark-test/word-count/output
It works just fine.
Now I understand that 'Spark on YARN' requires the Spark jar files to be available on the cluster, and if I don't do anything then every time I run my program it will copy hundreds of jar files from $SPARK_HOME to each node (in my case it's just one node). I can see that the execution pauses for some time before it finishes copying. See below -
16/12/12 17:24:03 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/12/12 17:24:06 INFO yarn.Client: Uploading resource file:/tmp/spark-a6cc0d6e-45f9-4712-8bac-fb363d6992f2/__spark_libs__11112433502351931.zip -> hdfs://sanjeevd.br:9000/user/sanjeevd/.sparkStaging/application_1481592214176_0001/__spark_libs__11112433502351931.zip
16/12/12 17:24:08 INFO yarn.Client: Uploading resource file:/home/sanjeevd/personal/Spark-Simple/target/SparkSimple-0.0.1-SNAPSHOT.jar -> hdfs://sanjeevd.br:9000/user/sanjeevd/.sparkStaging/application_1481592214176_0001/SparkSimple-0.0.1-SNAPSHOT.jar
16/12/12 17:24:08 INFO yarn.Client: Uploading resource file:/tmp/spark-a6cc0d6e-45f9-4712-8bac-fb363d6992f2/__spark_conf__6716604236006329155.zip -> hdfs://sanjeevd.br:9000/user/sanjeevd/.sparkStaging/application_1481592214176_0001/__spark_conf__.zip
Spark's documentation suggests setting the spark.yarn.jars property to avoid this copying. So I set the below property in the spark-defaults.conf file.
spark.yarn.jars hdfs://sanjeevd.br:9000//user/spark/share/lib
http://spark.apache.org/docs/latest/running-on-yarn.html#preparations
To make Spark runtime jars accessible from YARN side, you can specify spark.yarn.archive or spark.yarn.jars. For details please refer to Spark Properties. If neither spark.yarn.archive nor spark.yarn.jars is specified, Spark will create a zip file with all jars under $SPARK_HOME/jars and upload it to the distributed cache.
Btw, I have copied all the jar files from the local /opt/spark/jars to HDFS /user/spark/share/lib. They are 206 in number.
This makes my job fail. Below is the error -
spark-submit --class com.sanjeevd.sparksimple.wordcount.JobRunner --master yarn --deploy-mode cluster --driver-memory=2g --executor-memory 2g --executor-cores 1 --num-executors 1 SparkSimple-0.0.1-SNAPSHOT.jar hdfs://sanjeevd.br:9000/user/spark-test/word-count/input hdfs://sanjeevd.br:9000/user/spark-test/word-count/output
16/12/12 17:43:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/12/12 17:43:07 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/12/12 17:43:07 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
16/12/12 17:43:07 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (5120 MB per container)
16/12/12 17:43:07 INFO yarn.Client: Will allocate AM container, with 2432 MB memory including 384 MB overhead
16/12/12 17:43:07 INFO yarn.Client: Setting up container launch context for our AM
16/12/12 17:43:07 INFO yarn.Client: Setting up the launch environment for our AM container
16/12/12 17:43:07 INFO yarn.Client: Preparing resources for our AM container
16/12/12 17:43:07 INFO yarn.Client: Uploading resource file:/home/sanjeevd/personal/Spark-Simple/target/SparkSimple-0.0.1-SNAPSHOT.jar -> hdfs://sanjeevd.br:9000/user/sanjeevd/.sparkStaging/application_1481592214176_0005/SparkSimple-0.0.1-SNAPSHOT.jar
16/12/12 17:43:07 INFO yarn.Client: Uploading resource file:/tmp/spark-fae6a5ad-65d9-4b64-9ba6-65da1310ae9f/__spark_conf__7881471844385719101.zip -> hdfs://sanjeevd.br:9000/user/sanjeevd/.sparkStaging/application_1481592214176_0005/__spark_conf__.zip
16/12/12 17:43:08 INFO spark.SecurityManager: Changing view acls to: sanjeevd
16/12/12 17:43:08 INFO spark.SecurityManager: Changing modify acls to: sanjeevd
16/12/12 17:43:08 INFO spark.SecurityManager: Changing view acls groups to:
16/12/12 17:43:08 INFO spark.SecurityManager: Changing modify acls groups to:
16/12/12 17:43:08 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(sanjeevd); groups with view permissions: Set(); users with modify permissions: Set(sanjeevd); groups with modify permissions: Set()
16/12/12 17:43:08 INFO yarn.Client: Submitting application application_1481592214176_0005 to ResourceManager
16/12/12 17:43:08 INFO impl.YarnClientImpl: Submitted application application_1481592214176_0005
16/12/12 17:43:09 INFO yarn.Client: Application report for application_1481592214176_0005 (state: ACCEPTED)
16/12/12 17:43:09 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1481593388442
final status: UNDEFINED
tracking URL: http://sanjeevd.br:8088/proxy/application_1481592214176_0005/
user: sanjeevd
16/12/12 17:43:10 INFO yarn.Client: Application report for application_1481592214176_0005 (state: FAILED)
16/12/12 17:43:10 INFO yarn.Client:
client token: N/A
diagnostics: Application application_1481592214176_0005 failed 1 times due to AM Container for appattempt_1481592214176_0005_000001 exited with exitCode: 1
For more detailed output, check application tracking page:http://sanjeevd.br:8088/cluster/app/application_1481592214176_0005Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1481592214176_0005_01_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1481593388442
final status: FAILED
tracking URL: http://sanjeevd.br:8088/cluster/app/application_1481592214176_0005
user: sanjeevd
16/12/12 17:43:10 INFO yarn.Client: Deleting staging directory hdfs://sanjeevd.br:9000/user/sanjeevd/.sparkStaging/application_1481592214176_0005
Exception in thread "main" org.apache.spark.SparkException: Application application_1481592214176_0005 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1132)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1175)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/12/12 17:43:10 INFO util.ShutdownHookManager: Shutdown hook called
16/12/12 17:43:10 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-fae6a5ad-65d9-4b64-9ba6-65da1310ae9f
Do you know what I am doing wrong? The task's log says the below -
Error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster
I understand the error, that the ApplicationMaster class is not found, but my question is why it is not found - where is this class supposed to be? I don't have an assembly jar, since I'm using Spark 2.0.1 where no assembly comes bundled.
What does this have to do with the spark.yarn.jars property? This property is there to help Spark run on YARN, and that should be it. What additional steps do I need when using spark.yarn.jars?
Thanks for reading this question and for your help in advance.
You could also use the spark.yarn.archive option and set it to the location of an archive (which you create) containing all the JARs in the $SPARK_HOME/jars/ folder, at the root level of the archive. For example:
1. Create the archive: jar cv0f spark-libs.jar -C $SPARK_HOME/jars/ .
2. Upload it to HDFS: hdfs dfs -put spark-libs.jar /some/path/.
2a. For a large cluster, increase the replication count of the Spark archive so that you reduce the number of times a NodeManager has to do a remote copy: hdfs dfs -setrep -w 10 hdfs:///some/path/spark-libs.jar (change the number of replicas in proportion to the total number of NodeManagers).
3. Set spark.yarn.archive to hdfs:///some/path/spark-libs.jar
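The resulting spark-defaults.conf entry is a single line; a minimal sketch using the placeholder path from the steps above:
spark.yarn.archive hdfs:///some/path/spark-libs.jar
With that in place, the "Neither spark.yarn.jars nor spark.yarn.archive is set" warning quoted earlier in the question should go away, and the per-submission upload of the __spark_libs__*.zip file is avoided.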
I was finally able to make sense of this property. I found by trial and error that the correct syntax of this property is
spark.yarn.jars=hdfs://xx:9000/user/spark/share/lib/*.jar
I hadn't put *.jar at the end, and my path just ended with /lib. I tried putting the actual jar like this - spark.yarn.jars=hdfs://sanjeevd.brickred:9000/user/spark/share/lib/spark-yarn_2.11-2.0.1.jar - but no luck. All it said was that it was unable to load ApplicationMaster.
I posted my response to a similar question asked by someone at https://stackoverflow.com/a/41179608/2332121
If you look at the spark.yarn.jars documentation, it says the following:
List of libraries containing Spark code to distribute to YARN containers. By default, Spark on YARN will use Spark jars installed locally, but the Spark jars can also be in a world-readable location on HDFS. This allows YARN to cache it on nodes so that it doesn't need to be distributed each time an application runs. To point to jars on HDFS, for example, set this configuration to hdfs:///some/path. Globs are allowed.
This means that you are actually overriding SPARK_HOME/jars and telling YARN to pick up all the jars required for the application run from your path. If you set the spark.yarn.jars property, all the dependent jars Spark needs to run should be present in this path. If you look inside the spark-assembly.jar present in SPARK_HOME/lib, the org.apache.spark.deploy.yarn.ApplicationMaster class is present there, so make sure that all the Spark dependencies are present in the HDFS path that you specify as spark.yarn.jars.
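As a quick sanity check, a sketch along these lines (the HDFS path and jar name follow the question's setup and are otherwise assumptions) confirms that the spark-yarn jar is among the uploaded jars and that it really contains the ApplicationMaster class:
# The glob hdfs://xx:9000/user/spark/share/lib/*.jar must match the spark-yarn jar...
hdfs dfs -ls /user/spark/share/lib | grep spark-yarn
# ...and that jar must contain org/apache/spark/deploy/yarn/ApplicationMaster.class
unzip -l "$SPARK_HOME/jars/spark-yarn_2.11-2.0.1.jar" | grep ApplicationMaster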
