Spark-Submit is throwing exception when running in yarn-cluster mode - apache-spark

I have a simple Spark app for learning purposes. This Scala program parallelizes a List and writes the resulting RDD to a file in Hadoop.
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
object HelloSpark {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("HelloSPark1").setMaster(args(0))
    val sc = new SparkContext(conf)
    val i = List(1,4,2,11,23,45,67,8,909,5,1,8,"agarwal",19,11,12,34,8031,"aditya")
    val b = sc.parallelize(i,3)
    b.saveAsTextFile(args(1))
  }
}
I build a jar file, and when I run it on my cluster with --master yarn and --deploy-mode cluster it throws an error. I use the following command:
spark-submit --class "HelloSpark" --master yarn --deploy-mode cluster sparkappl_2.11-1.0.jar yarn /user/letsbigdata9356/sparktest/run6
client token: N/A
diagnostics: Application application_1483332319047_3791 failed 2 times due to AM Container for appattempt_1483332319047_3791_000002 exited with exitCode: 15
For more detailed output, check application tracking page:http://a.cloudxlab.com:8088/cluster/app/application_1483332319047_3791Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e77_1483332319047_3791_02_000001
Exit code: 15
Stack trace: ExitCodeException exitCode=15:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)
at org.apache.hadoop.util.Shell.run(Shell.java:487)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 15
Failing this attempt. Failing the application.
Container exited with a non-zero exit code 15
Failing this attempt. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1484231621733
final status: FAILED
tracking URL: http://a.cloudxlab.com:8088/cluster/app/application_1483332319047_3791
user: letsbigdata9356
Exception in thread "main" org.apache.spark.SparkException: Application application_1483332319047_3791 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:974)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1020)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:685)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/01/12 14:34:10 INFO ShutdownHookManager: Shutdown hook called
But when I run it in yarn-client mode or local mode with the following commands, it works fine:
spark-submit --class "HelloSpark" sparkappl_2.11-1.0.jar yarn-client /user/letsbigdata9356/sparktest/run5
or
spark-submit --class "HelloSpark" sparkappl_2.11-1.0.jar local /user/letsbigdata9356/sparktest/run7
I am new to Spark. Could you please help me resolve and understand this issue?
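The client-side report above only says that the ApplicationMaster container exited with code 15; the exception thrown by the driver itself ends up in the container logs. Below is a minimal sketch, assuming the client output has been captured as a string and that log aggregation is enabled on the cluster, that pulls the application ID out of the report and builds the `yarn logs -applicationId <app ID>` command to fetch those logs. The helper names `extract_application_id` and `yarn_logs_command` are hypothetical, not part of Spark or YARN.

```python
import re
from typing import List

# YARN application IDs look like application_<cluster timestamp>_<sequence number>.
APP_ID_PATTERN = re.compile(r"application_\d+_\d+")


def extract_application_id(report: str) -> str:
    """Return the first YARN application ID found in the client report."""
    match = APP_ID_PATTERN.search(report)
    if match is None:
        raise ValueError("no YARN application ID found in the report")
    return match.group(0)


def yarn_logs_command(report: str) -> List[str]:
    """Build the argument list for `yarn logs -applicationId <app ID>`."""
    return ["yarn", "logs", "-applicationId", extract_application_id(report)]


if __name__ == "__main__":
    # Diagnostics line copied from the report in the question.
    report = (
        "diagnostics: Application application_1483332319047_3791 failed 2 times "
        "due to AM Container for appattempt_1483332319047_3791_000002 "
        "exited with exitCode: 15"
    )
    print(" ".join(yarn_logs_command(report)))
```

The resulting argument list can be pasted into a shell or passed to `subprocess.run` on a machine where the `yarn` client is configured.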

Related

java.io.EOFException when using spark-submit with yarn as master on a cluster

I'm trying to run a jar file with this spark-submit command:
spark-submit --master yarn --deploy-mode cluster --executor-memory 3g --class my.package.Main my-jar-file.jar
The class Main is the jar's main class, and here's the contents (all in Scala):
object Main{
  def main(args: Array[String]){
    val server = HttpServer.create(new InetSocketAddress("master", 8000), 0)
    val backend = new MainProcess()
    val handlerRoot = new RootHandler()
    handlerRoot.initProcess(backend)
    server.createContext("/", handlerRoot)
    server.setExecutor(null)
    server.start()
    println("Server is started at " + server.getAddress().getHostString() + ":" + server.getAddress().getPort())
  }
}
The class MainProcess is the class where I do the stuff with Spark and Spark GraphX library using the files obtained from HDFS. This is how I configure the SparkContext in MainProcess class:
class MainProcess{
  val config = new SparkConf()
  config.setAppName("Final GraphX App - Main")
  val sc = new SparkContext(config)
  ...
}
The app seems to run fine and the final status is reported as a success, but the app simply exits instead of running continuously, as a server should. I can open master:8000 only once; when I refresh the page I am back to "unable to connect". Here's the log from running the app:
18/04/06 15:45:59 ERROR yarn.YarnAllocator: Failed to launch executor 2 on container container_1522920902032_0027_01_000003
org.apache.spark.SparkException: Exception while starting container container_1522920902032_0027_01_000003 on host slave2
at org.apache.spark.deploy.yarn.ExecutorRunnable.startContainer(ExecutorRunnable.scala:125)
at org.apache.spark.deploy.yarn.ExecutorRunnable.run(ExecutorRunnable.scala:65)
at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$runAllocatedContainers$1$$anon$1.run(YarnAllocator.scala:523)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Failed on local exception: java.io.IOException: java.io.EOFException; Host Details : local host is: "master/10.100.69.207"; destination host is: "slave2":57914;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:776)
at org.apache.hadoop.ipc.Client.call(Client.java:1479)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy19.startContainers(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:96)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy20.startContainers(Unknown Source)
at org.apache.hadoop.yarn.client.api.impl.NMClientImpl.startContainer(NMClientImpl.java:201)
at org.apache.spark.deploy.yarn.ExecutorRunnable.startContainer(ExecutorRunnable.scala:122)
... 5 more
Caused by: java.io.IOException: java.io.EOFException
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:687)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:650)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:737)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
... 18 more
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:367)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:560)
at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:375)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:729)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:725)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:725)
... 21 more
This app is basically a web app built on the Java HTTP Server (com.sun.net.httpserver.HttpServer), and it uses Spark to process big data. Incoming requests are accepted by the handler class, which starts a new thread to run the Spark job in the background. The user can send another request to check whether the Spark job has finished, so that the result can be shown on the web page. The problem is that the server is "killed" every time Spark claims to have finished a job (or, in this case, failed one).
I'm using Spark 2.2.0 built for Hadoop 2.7 and Hadoop 2.7.1. All data files are in HDFS.

Why does Spark on YARN in cluster mode fail with "Exception in thread "Driver" java.lang.NullPointerException"?

I'm using emr-5.4.0 with Spark 2.1.0. I understand what a NullPointerException is; this question is about why it was thrown in this particular case.
I cannot figure out why I got a NullPointerException in the driver thread.
The job failed with this error:
18/03/29 20:07:52 INFO ApplicationMaster: Starting the user application in a separate Thread
18/03/29 20:07:52 INFO ApplicationMaster: Waiting for spark context initialization...
Exception in thread "Driver" java.lang.NullPointerException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
18/03/29 20:07:52 ERROR ApplicationMaster: Uncaught exception:
java.lang.IllegalStateException: SparkContext is null but app is still running!
at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:415)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:254)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:766)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:67)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:66)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:764)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
18/03/29 20:07:52 INFO ApplicationMaster: Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: java.lang.IllegalStateException: SparkContext is null but app is still running!)
18/03/29 20:07:52 INFO ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: Uncaught exception: java.lang.IllegalStateException: SparkContext is null but app is still running!)
18/03/29 20:07:52 INFO ApplicationMaster: Deleting staging directory hdfs://<ip-address>.ec2.internal:8020/user/hadoop/.sparkStaging/application_1522348295743_0010
18/03/29 20:07:52 INFO ShutdownHookManager: Shutdown hook called
End of LogType:stderr
I submitted the job like this:
spark-submit --deploy-mode cluster --master yarn --num-executors 40 --executor-cores 16 --executor-memory 100g --driver-cores 8 --driver-memory 100g --class <package.class_name> --jars <s3://s3_path/some_lib.jar> <s3://s3_path/class.jar>
And my class looks like this:
class MyClass {
  def main(args: Array[String]): Unit = {
    val c = new MyClass()
    c.process()
  }
  def process(): Unit = {
    val sparkConf = new SparkConf().setAppName("my-test")
    val sparkSession: SparkSession = SparkSession.builder().config(sparkConf).getOrCreate()
    import sparkSession.implicits._
    ....
  }
  ...
}
Change class MyClass to object MyClass and you're done.
While we're at it, I'd also change class MyClass to object MyClass extends App and remove def main(args: Array[String]): Unit (as given by extends App).
I've reported an improvement for Spark 2.3.0 - [SPARK-23830] Spark on YARN in cluster deploy mode fail with NullPointerException when a Spark application is a Scala class not object - to have it reported nicely to an end user.
Digging deeper into how Spark on YARN works, the following message is logged when the ApplicationMaster of a Spark application starts the driver (you used --deploy-mode cluster --master yarn with spark-submit):
ApplicationMaster: Starting the user application in a separate Thread
Right after the INFO message you should see another:
ApplicationMaster: Waiting for spark context initialization...
This is part of the driver initialization when the ApplicationMaster runs.
The exception Exception in thread "Driver" java.lang.NullPointerException is caused by the following code:
val mainMethod = userClassLoader.loadClass(args.userClass)
  .getMethod("main", classOf[Array[String]])
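Note that getMethod never returns null (it throws NoSuchMethodException when no matching method exists), so mainMethod itself is not null at this point. Because MyClass is a class rather than an object, its main is an instance method, not a static one, and the following line invokes it with null as the receiver, which is what makes Method.invoke throw the NullPointerException: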
mainMethod.invoke(null, userArgs.toArray)
The thread is indeed called Driver (as in Exception in thread "Driver" java.lang.NullPointerException), as set in these lines:
userThread.setContextClassLoader(userClassLoader)
userThread.setName("Driver")
userThread.start()
The line numbers differ since I used Spark 2.3.0 to reference the lines while you use emr-5.4.0 with Spark 2.1.0.

Spark streaming - class not found - HDFS file streaming - java.lang.ClassNotFoundException: com.pepperdata.spark.metrics.PepperdataSparkListener

I submitted a Spark Streaming job in YARN cluster mode, but I am getting the following error.
spark-submit command:
export SPARK_CLASSPATH=/usr/hdp/current/hbase-client/lib/hbase-common.jar:/usr/hdp/current/hbase-client/lib/hbase-client.jar:/usr/hdp/current/hbase-client/lib/hbase-server.jar:/usr/hdp/current/hbase-client/lib/hbase-protocol.jar:/usr/hdp/current/hbase-client/lib/guava-12.0.1.jar:/usr/hdp/current/hbase-client/lib/htrace-core-3.1.0-incubating.jar
spark-submit --master yarn-cluster --keytab /etc/security/keytabs/srvc_egsc_hdpuser.service.keytab --principal srvc_egsc_hdpuser@EAPKDC.HOUSTON.HP.COM --queue sc_streaming --class com.reni.scmplatform.data.producer.DPMain --executor-memory 5g --driver-memory 8g --conf spark.sql.shuffle.partitions=10 --conf spark.default.parallelism=50 --jars /usr/hdp/current/hbase-client/lib/hbase-common.jar,/usr/hdp/current/hbase-client/lib/hbase-client.jar,/usr/hdp/current/hbase-client/lib/hbase-server.jar,/usr/hdp/current/hbase-client/lib/hbase-protocol.jar,/usr/hdp/current/hbase-client/lib/guava-12.0.1.jar,/usr/hdp/current/hbase-client/lib/htrace-core-3.1.0-incubating.jar --files /etc/spark/conf/hbase-site.xml,/etc/spark/conf/hive-site.xml hdfs://EAPROD/EA/supplychain/streaming/logistics/entaly/jars/DataProducer-assembly-1.0.15-SNAPSHOT.jar --platform.framework.hdfs.logging.dir=/EA/supplychain/process/logs/logistics/entaly/dataProducer --platform.framework.logging.level=info --platform.framework.logging.publish=true
Error:
18/03/12 05:14:30 ERROR ApplicationMaster: User class threw exception: org.apache.spark.SparkException: Exception when registering SparkListener
org.apache.spark.SparkException: Exception when registering SparkListener
at org.apache.spark.SparkContext.setupAndStartListenerBus(SparkContext.scala:2154)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:578)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2280)
at org.apache.spark.streaming.StreamingContext.<init>(StreamingContext.scala:140)
at org.apache.spark.streaming.StreamingContext$$anonfun$getOrCreate$1.apply(StreamingContext.scala:877)
at org.apache.spark.streaming.StreamingContext$$anonfun$getOrCreate$1.apply(StreamingContext.scala:877)
at scala.Option.map(Option.scala:145)
at org.apache.spark.streaming.StreamingContext$.getOrCreate(StreamingContext.scala:877)
at com.reni.scmplatform.data.producer.helper.DPStreamEventHandler.start(DPStreamEventHandler.scala:63)
at com.reni.scmplatform.data.producer.DPMain$.main(DPMain.scala:27)
at com.reni.scmplatform.data.producer.DPMain.main(DPMain.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:561)
Caused by: java.lang.ClassNotFoundException: com.pepperdata.spark.metrics.PepperdataSparkListener
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:175)
at org.apache.spark.SparkContext$$anonfun$setupAndStartListenerBus$1.apply(SparkContext.scala:2122)
at org.apache.spark.SparkContext$$anonfun$setupAndStartListenerBus$1.apply(SparkContext.scala:2119)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
at org.apache.spark.SparkContext.setupAndStartListenerBus(SparkContext.scala:2119)
... 15 more
18/03/12 05:14:30 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
18/03/12 05:14:30 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
18/03/12 05:14:30 INFO ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: org.apache.spark.SparkException: Exception when registering SparkListener)
You should add the JAR containing the missing class to the job classpath by using the --jars option (see this answer: spark submit add multiple jars in classpath)
Alternatively, I use the sbt-assembly plugin, which takes care of this for you:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")
Then build with sbt assembly, and all the jars your application needs will be included in the job jar sent to YARN.
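Coming back to the --jars option: it takes a comma-separated list, whereas the SPARK_CLASSPATH value in the question is colon-separated. Here is a small sketch of converting one form into the other when adding a further jar to the list. `classpath_to_jars_option` is a hypothetical helper, and the paths are copied from the question.

```python
def classpath_to_jars_option(classpath: str, separator: str = ":") -> str:
    """Convert a classpath string into the comma-separated value used by --jars."""
    entries = [entry.strip() for entry in classpath.split(separator)]
    # Skip empty entries produced by doubled or trailing separators.
    return ",".join(entry for entry in entries if entry)


if __name__ == "__main__":
    classpath = (
        "/usr/hdp/current/hbase-client/lib/hbase-common.jar:"
        "/usr/hdp/current/hbase-client/lib/hbase-client.jar"
    )
    print("--jars", classpath_to_jars_option(classpath))
```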

Spark runs in local but can't find file when running in YARN

I've been trying to submit a simple Python script to run on a cluster with YARN. When I run the job locally everything works fine, but when I run it on the cluster it fails.
I executed the submit with the following command:
spark-submit --master yarn --deploy-mode cluster test.py
The log error I'm receiving is the following one:
17/11/07 13:02:48 INFO yarn.Client: Application report for application_1510046813642_0010 (state: ACCEPTED)
17/11/07 13:02:49 INFO yarn.Client: Application report for application_1510046813642_0010 (state: ACCEPTED)
17/11/07 13:02:50 INFO yarn.Client: Application report for application_1510046813642_0010 (state: FAILED)
17/11/07 13:02:50 INFO yarn.Client:
client token: N/A
diagnostics: Application application_1510046813642_0010 failed 2 times due to AM Container for appattempt_1510046813642_0010_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://myserver:8088/proxy/application_1510046813642_0010/Then, click on links to logs of each attempt.
Diagnostics: File does not exist: hdfs://myserver:8020/user/josholsan/.sparkStaging/application_1510046813642_0010/test.py
java.io.FileNotFoundException: File does not exist: hdfs://myserver:8020/user/josholsan/.sparkStaging/application_1510046813642_0010/test.py
at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1266)
at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1258)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1258)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Failing this attempt. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: root.users.josholsan
start time: 1510056155796
final status: FAILED
tracking URL: http://myserver:8088/cluster/app/application_1510046813642_0010
user: josholsan
Exception in thread "main" org.apache.spark.SparkException: Application application_1510046813642_0010 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1025)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1072)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:730)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/11/07 13:02:50 INFO util.ShutdownHookManager: Shutdown hook called
17/11/07 13:02:50 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-5cc8bf5e-216b-4d9e-b66d-9dc01a94e851
I paid special attention to this line:
Diagnostics: File does not exist: hdfs://myserver:8020/user/josholsan/.sparkStaging/application_1510046813642_0010/test.py
I don't know why it can't find test.py. I also tried putting it in HDFS under the home directory of the user executing the job: /user/josholsan/
Finally, here is my test.py script:
from pyspark import SparkContext
file="/user/josholsan/concepts_copy.csv"
sc = SparkContext("local","Test app")
textFile = sc.textFile(file).cache()
linesWithOMOP=textFile.filter(lambda line: "OMOP" in line).count()
linesWithICD=textFile.filter(lambda line: "ICD" in line).count()
print("Lines with OMOP: %i, lines with ICD9: %i" % (linesWithOMOP,linesWithICD))
Could the error also be here?
sc = SparkContext("local","Test app")
Thank you so much for your help in advance.
Transferred from the comments section:
sc = SparkContext("local","Test app"): having "local" here will override any command line settings; from the docs:
Any values specified as flags or in the properties file will be passed on to the application and merged with those specified through SparkConf. Properties set directly on the SparkConf take highest precedence, then flags passed to spark-submit or spark-shell, then options in the spark-defaults.conf file.
The test.py file must be placed somewhere where it is visible throughout the whole cluster. E.g. spark-submit --master yarn --deploy-mode cluster http://somewhere/accessible/to/master/and/workers/test.py
Any additional files and resources can be specified using the --py-files argument (tested in mesos, not in yarn unfortunately), e.g. --py-files http://somewhere/accessible/to/all/extra_python_code_my_code_uses.zip
Edit: as @desertnaut commented, this argument should be used before the script to be executed.
yarn logs -applicationId <app ID> will give you the output of your submitted job.
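The precedence rule quoted from the docs can be modelled with plain dictionaries. This is only an illustrative sketch, not Spark's implementation: `resolve_setting` is a hypothetical helper, and the three dictionaries stand in for values set on SparkConf in code, flags passed to spark-submit, and entries in spark-defaults.conf. It shows why the hard-coded "local" wins over --master yarn.

```python
from typing import Dict, Optional


def resolve_setting(
    key: str,
    spark_conf: Dict[str, str],
    submit_flags: Dict[str, str],
    defaults_file: Dict[str, str],
) -> Optional[str]:
    """Return the effective value: SparkConf first, then flags, then defaults."""
    for source in (spark_conf, submit_flags, defaults_file):
        if key in source:
            return source[key]
    return None


if __name__ == "__main__":
    # Stand-ins for `--master yarn --deploy-mode cluster` and spark-defaults.conf.
    flags = {"spark.master": "yarn", "spark.submit.deployMode": "cluster"}
    defaults = {"spark.master": "local[*]"}

    # SparkContext("local", "Test app") sets the master in code, so it wins.
    hard_coded = {"spark.master": "local", "spark.app.name": "Test app"}
    print(resolve_setting("spark.master", hard_coded, flags, defaults))  # local

    # With no master set in code, the --master flag takes effect.
    only_name = {"spark.app.name": "Test app"}
    print(resolve_setting("spark.master", only_name, flags, defaults))  # yarn
```

In other words, leaving the master out of the script lets the same file run unchanged in local, client, and cluster mode, with the mode chosen on the spark-submit command line.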
Hope this helps, good luck!

Spark Jobs crashing with ExitCodeException exitCode=15

I am running a very long spark job which crashes with the following error
Application application_1456200816465_347125 failed 2 times due to AM Container for appattempt_1456200816465_347125_000002 exited with exitCode: 15
For more detailed output, check application tracking page:http://foo.com:8088/proxy/application_1456200816465_347125/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e24_1456200816465_347125_02_000001
Exit code: 15
Stack trace: ExitCodeException exitCode=15:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 15
Failing this attempt. Failing the application.
I click on the link provided in the error message above and that shows me
java.io.IOException: Target log file already exists (hdfs://nameservice1/user/spark/applicationHistory/application_1456200816465_347125)
at org.apache.spark.scheduler.EventLoggingListener.stop(EventLoggingListener.scala:201)
at org.apache.spark.SparkContext$$anonfun$stop$5.apply(SparkContext.scala:1394)
at org.apache.spark.SparkContext$$anonfun$stop$5.apply(SparkContext.scala:1394)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1394)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:107)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
If I restart the job it works fine for an hour or so and then fails again with this error. Note that hdfs://nameservice1/user/spark/applicationHistory/application_1456200816465_347125 is generated by the system; this folder has nothing to do with my application.
I searched the internet and found that many people got this error because they were setting the master to local in their code. This is how I initialize my Spark context:
val conf = new SparkConf().setAppName("Foo")
val context = new SparkContext(conf)
context.hadoopConfiguration.set("mapreduce.input.fileinputformat.input.dir.recursive","true")
val sc = new SQLContext(context)
and I run my Spark job like this:
sudo -u web nohup spark-submit --class com.abhi.Foo --master yarn-cluster Foo-assembly-1.0.jar "2015-03-18" "2015-03-30" > fn_output.txt 2> fn_error.txt &
