Spark sql throws java.lang.OutOfMemoryError in yarn cluster mode but works in yarn client mode - apache-spark

I have a simple hive query which works fine in yarn client mode using pyspark shell where as it throws me the below error when i run it in yarn-cluster mode.
Exception in thread "Thread-6"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Thread-6"
Exception in thread "Reporter"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Reporter"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "sparkDriver-scheduler-1"
Cluster information: Hadoop 2.4, Spark 1.4.0-hadoop2.4 ,hive 0.13.1
The script takes 10 columns from a hive table and does some transformations and writes it to a file.
> num-executors 200 executor-memory 8G driver-memory 16G executor-cores 3
Full stack trace:
py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o62.javaToPython.
: java.lang.OutOfMemoryError: PermGen space at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2570)
at java.lang.Class.getDeclaredMethods(Class.java:1855)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:206)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:132)
at org.apache.spark.SparkContext.clean(SparkContext.scala:1891)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:683)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:682)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:109)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:286)
at org.apache.spark.rdd.RDD.mapPartitions(RDD.scala:682)
at org.apache.spark.api.python.SerDeUtil$.javaToPython(SerDeUtil.scala:140)
at org.apache.spark.sql.DataFrame.javaToPython(DataFrame.scala:1435)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)

java.lang.OutOfMemoryError: PermGen space at java.lang.ClassLoader.defineClass1(...
You are likely running out of "permanent generation" heap space in the driver's JVM. This area is used to store classes. When we run in cluster mode, the JVM needs to load more classes (I think this is because the Application Manager runs inside the same JVM as the driver). To increase the PermGen area, add the following option:
--driver-java-options -XX:MaxPermSize=256M
See also https://plumbr.eu/outofmemoryerror/permgen-space
When using HiveContext in your Python program, I've found that the following option is also needed:
--files /usr/hdp/current/spark-client/conf/hive-site.xml
See also https://community.hortonworks.com/questions/27239/executing-spark-submit-with-yarn-cluster-mode-and.html
I've also wanted to specify a specific version of Python to use, which requires another option:
--conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=/usr/local/bin/python2.7
See also https://issues.apache.org/jira/browse/SPARK-9235

Little addition to Mark's answer - sometimes Spark with HiveContext complains about OutOfMemoryError without any mention of PermGen, however only -XX:MaxPermSize helps.
So if you dealing with OOM when Spark + HiveContext is used, also try -XX:MaxPermSize

Related

Stop and Restart SparkContext executing in deploy mode "cluster"

In order to fit the efficiency requirements, I am forced to stop SparkContext and restart it with a new configuration more optimal in terms of number of executors, memory per executor, executor memory overhead...
I can achieve this launching my spark-submit in client mode :
spark-submit --num-executors 5 \
--deploy-mode client \
--class className spark.jar
And then within my code executing:
spark.stop()
val spark2 : SparkSession = SparkSession.builder
.config("spark.submit.deployMode", "client")
.config("spark.executor.instances", "8")
.getOrCreate()
And everything works OK.
However, when launching in client mode, stopping the SparkContext and restarting sparkContext in cluster mode, I get the following error:
20/05/28 18:05:24 ERROR spark.SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Detected yarn cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.
at org.apache.spark.SparkContext.<init>(SparkContext.scala:379)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2520)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:935)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:926)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:926)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
20/05/28 18:05:24 ERROR util.Utils: Uncaught exception in thread main
java.lang.NullPointerException
at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$postApplicationEnd(SparkContext.scala:2416)
at org.apache.spark.SparkContext$$anonfun$stop$1.apply$mcV$sp(SparkContext.scala:1931)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1385)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1930)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:585)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2520)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:935)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:926)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:926)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
I have also tried launching spark-submit in cluster mode, stopping SparkCOntext and restarting it again in cluster mode. In this case I get the error:
Exception in thread "main" org.apache.spark.SparkException: Application application_1583287354042_80626 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1171)
at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1608)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
I am not sure if it might be related to the fact that the driver is running on the cluster...
I'd be very grateful if someone could provide a solution to achieve these requirements.

Unable to run Spark in yarn-cluster mode

I'm trying to run spark job with YARN in cluster deploy mode.
I tried to run the simpliest spark-submit command only with jar path, class parameter and master yarn-cluster. However I still have the same error, which actually tells me nothing.
Exception in thread "main" org.apache.spark.SparkException: Application application_1506196351647_0032 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1078)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1125)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
If anyone had the similar problem please let me know, I'm using spark 1.6, hadoop 2.6.

IllegalArgumentException with Spark 1.6

I'm running Spark 1.6.0 on CDH 5.7 and I've upgraded my original application from 1.4.1 to 1.6.0. When I try to run my application (which previously worked fine) I get the following error:
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed
at scala.Predef$.require(Predef.scala:221)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$6$$anonfun$apply$3.apply(Client.scala:473)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$6$$anonfun$apply$3.apply(Client.scala:471)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$6.apply(Client.scala:471)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$6.apply(Client.scala:469)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:469)
at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:725)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:143)
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1023)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1083)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
I submit the application with:
--jars is a comma-separated list of jars (with absolute paths)
--files is a comma-separated list of files (with absolute paths)
--driver-class-path is a colon-separated list of resources (without the full path, just the file names)
I have tried it with full paths for the driver (and executor) class paths, but that gives me the same issue. All files and jars submitted with the app exist, I checked.
Could this be related to the issue with duplicates in the distributed cache or is this another issue?
From the source code I see that the only calls to require without a custom message (as in the stack trace) are related to the distribute() method. If so, how can I run applications without upgrading Spark?
This is the exception which results from having the same path/URI appear twice in the argument to the --files option.

JavaHiveContext inssue in spark

I used
spark 1.2.1
hadoop 2.3.0-cdh5.0.2
hbase 0.96.1.1-cdh5.0.2
when I run spark app, it always show the below exception.
Actually I have
org.apache.hbase
hbase-protocol
And protobuf-java 2.5.0 has been build to spark assemble jar.
spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, cfswps1d-phys.nam.nsroot.net): java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/protobuf/generated/MasterProtos$MasterService$BlockingInterface
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:186)
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:377)
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:366)
at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:247)
at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:188)
at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:150)
at org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:101)
at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:130)
at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:107)
at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:69)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:61)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:245)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:56)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingInterface
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
... 23 more
You will need to add the classes you are referencing in your program that are mentioned in the error: $MasterService$BlockingInterface to your Spark Conf File:
spark.driver.extraClassPath=
You can read more about the extraClassPath option on the spark site.
Note: In client mode, this config must not be set through the SparkConf directly in your application, because the driver JVM has already started at that point. Instead, please set this through the --driver-class-path command line option or in your default properties file.="whatever jar file directory or class files"

Spark 1.1.0 on cdh5.1.3 does not work in yarn-cluster mode

I am having CDH 5.1 (Hadoop 2.3.0-cdh5.1.3) installed on my cluster, version:
I have installed and configured a prebuilt version of Spark 1.1.0 (Apache Version), built for hadoop 2.3 on my cluster.
when I run the Pi example in the ‘client mode’, it runs successfully, but it fails in the ‘yarn-cluster’ mode. The spark job is successfully submitted, but fails after polling the application master for sometime:
More Logs:
Application application_1415193640322_0016 failed 2 times due to Error launching appattempt_1415193640322_0016_000002. Got exception: org.apache.hadoop.yarn.exceptions.YarnException: java.io.EOFException
at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:710)
at org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagementProtocolPBServiceImpl.startContainers(ContainerManagementProtocolPBServiceImpl.java:60)
at org.apache.hadoop.yarn.proto.ContainerManagementProtocol$ContainerManagementProtocolService$2.callBlockingMethod(ContainerManagementProtocol.java:95)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
Caused by: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readUTF(DataInputStream.java:609)
at java.io.DataInputStream.readUTF(DataInputStream.java:564)
at org.apache.hadoop.yarn.security.ContainerTokenIdentifier.readFields(ContainerTokenIdentifier.java:151)
at org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:142)
at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerTokenIdentifier(BuilderUtils.java:262)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:696)
... 10 more
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:101)
at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:99)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:118)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:249)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.YarnException): java.io.EOFException
at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:710)
at org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagementProtocolPBServiceImpl.startContainers(ContainerManagementProtocolPBServiceImpl.java:60)
at org.apache.hadoop.yarn.proto.ContainerManagementProtocol$ContainerManagementProtocolService$2.callBlockingMethod(ContainerManagementProtocol.java:95)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
Caused by: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readUTF(DataInputStream.java:609)
at java.io.DataInputStream.readUTF(DataInputStream.java:564)
at org.apache.hadoop.yarn.security.ContainerTokenIdentifier.readFields(ContainerTokenIdentifier.java:151)
at org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:142)
at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerTokenIdentifier(BuilderUtils.java:262)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:696)
... 10 more
at org.apache.hadoop.ipc.Client.call(Client.java:1409)
at org.apache.hadoop.ipc.Client.call(Client.java:1362)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy69.startContainers(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:96)
... 5 more
. Failing the application.
When I go to node Manager logs:
Log Type: stderr
Log Length: 87
Error: Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher
Can you please suggest any solution.Do you think I should compile the spark code on my cluster. Or should I use Spark provided with CDH5.1.
Any help will be appreciated!
spark-shell does not work with spark yarn-cluster mode. You should add --master yarn-client
Example:
path/to/pyspark --master yarn-client

Resources