ERROR: User did not initialize spark context - apache-spark

Log error:
TestSuccessfull
2018-08-20 04:52:15 INFO ApplicationMaster:54 - Final app status: FAILED, exitCode: 13
2018-08-20 04:52:15 ERROR ApplicationMaster:91 - Uncaught exception:
java.lang.IllegalStateException: User did not initialize spark context!
at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:498)
at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:345)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply$mcV$sp(ApplicationMaster.scala:260)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$5.run(ApplicationMaster.scala:800)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:799)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:259)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:824)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
2018-08-20 04:52:15 INFO SparkContext:54 - Invoking stop() from shutdown hook
Error log on the console after the submit command:
2018-08-20 05:47:35 INFO Client:54 - Application report for application_1534690018301_0035 (state: ACCEPTED)
2018-08-20 05:47:36 INFO Client:54 - Application report for application_1534690018301_0035 (state: ACCEPTED)
2018-08-20 05:47:37 INFO Client:54 - Application report for application_1534690018301_0035 (state: FAILED)
2018-08-20 05:47:37 INFO Client:54 -
client token: N/A
diagnostics: Application application_1534690018301_0035 failed 2 times due to AM Container for appattempt_1534690018301_0035_000002 exited with exitCode: 13
Failing this attempt.Diagnostics: [2018-08-20 05:47:36.454]Exception from container-launch.
Container id: container_1534690018301_0035_02_000001
Exit code: 13
My code:
val sparkConf = new SparkConf().setAppName("Gathering Data")
val sc = new SparkContext(sparkConf)
Submit command:
spark-submit --class spark_basic.Test_Local --master yarn --deploy-mode cluster /home/IdeaProjects/target/Spark-1.0-SNAPSHOT.jar
Description:
I have installed Spark on Hadoop in pseudo-distributed mode.
spark-shell works fine; the only problem is when I use cluster mode.
My code also works fine. I am able to print output, but at the end it gives this error.

I presume your code has a line which sets the master to local:
SparkConf.setMaster("local[*]")
If so, try commenting out that line and submitting again, since you are already setting the master to yarn in your command:
/usr/cdh/current/spark-client/bin/spark-submit --class com.test.sparkApp --master yarn --deploy-mode cluster --num-executors 40 --executor-cores 4 --driver-memory 17g --executor-memory 22g --files /usr/cdh/current/spark-client/conf/hive-site.xml /home/user/sparkApp.jar
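For reference, a minimal sketch of the change being suggested, reusing the SparkConf snippet from the question (the commented-out line is the one to drop):
import org.apache.spark.{SparkConf, SparkContext}

val sparkConf = new SparkConf()
  .setAppName("Gathering Data")
  // .setMaster("local[*]")  // omit this so the --master yarn flag from spark-submit takes effect
val sc = new SparkContext(sparkConf)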

Finally I got it working with this spark-submit command:
/home/mahendra/Marvaland/SparkEcho/spark-2.3.0-bin-hadoop2.7/bin/spark-submit --master yarn --class spark_basic.Test_Local /home/mahendra/IdeaProjects/SparkTraining/target/SparkTraining-1.0-SNAPSHOT.jar
and this Spark session:
val spark = SparkSession.builder()
.appName("DataETL")
.master("local[1]")
.enableHiveSupport()
.getOrCreate()
Thanks @cricket_007
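If the job is meant to run with --master yarn in cluster mode, a minimal sketch (my assumption, not part of the original answer) would drop the hard-coded master from the builder so the value passed to spark-submit is the one that applies:
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("DataETL")
  .enableHiveSupport()   // requires Hive support on the cluster, as in the original snippet
  .getOrCreate()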

This error may occur if you are submitting the Spark job like this:
spark-submit --class some.path.com.Main --master yarn --deploy-mode cluster some_spark.jar (passing master and deploy-mode as arguments on the CLI) while at the same time calling new SparkContext in your code.
Either get the context with val sc = SparkContext.getOrCreate(), or do not pass the master and deploy-mode arguments to spark-submit if you want to create a new SparkContext.
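A minimal Scala sketch of the first option (the app name here is illustrative):
import org.apache.spark.{SparkConf, SparkContext}

// Reuse the context created by the launcher if one already exists; otherwise create one.
val conf = new SparkConf().setAppName("Gathering Data")
val sc = SparkContext.getOrCreate(conf)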

Related

Spark application exits with "ERROR root: EAP#5: Application configuration file is missing" before spark context initialization

I'm trying to execute a Spark job, using the following spark-submit command:
spark-submit --class package.Classname --queue QueueName \
  --executor-cores 2 --master yarn --deploy-mode cluster \
  --executor-memory 8G --num-executors 20 --driver-memory 10G \
  --conf "spark.yarn.executor.memoryOverhead=3G" \
  --conf "spark.speculation=true" \
  --conf "spark.network.timeout=600" \
  --conf "spark.rpc.askTimeout=600s" \
  --conf "spark.executor.heartbeatInterval=120s" \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j-spark.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j-spark.properties" \
  --conf "spark.serializer=org.apache.spark.serializer.KryoSerializer" \
  --conf "spark.root.logger=ALL,console" \
  --conf "spark.hadoop.validateOutputSpecs=false" \
  --conf "spark.driver.extraClassPath=/home/tumulusr/spark-defaults.conf" \
  --files /etc/spark2/2.6.4.0-91/0/hive-site.xml,config/ibnext.properties,config/hive.properties,config/mongo.properties,config/error.properties \
  /home/tumulusr/pn.jar
the application gets accepted but soon it exits with the following error:
ERROR root: EAP#5: Application configuration file is missing
INFO ApplicationMaster: Final app status: FAILED, exitCode: 16, (reason: Shutdown hook called before final status was reported.)
INFO ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: Shutdown hook called before final status was reported.)
INFO ApplicationMaster: Deleting staging directory (directory path with application id)
INFO ShutdownHookManager: Shutdown hook called
Am I missing anything in my spark-submit command?
Can you try the below:
export SPARK_MAJOR_VERSION=2;export HADOOP_CONF_DIR=**/hadoop/conf/path**; spark-submit

Fail to submit spark job

I am trying to run the Spark-solr Twitter example with spark-solr-3.4.4-shaded.jar,
bin/spark-submit --master local[2] \
  --conf "spark.driver.extraJavaOptions=-Dtwitter4j.oauth.consumerKey=? -Dtwitter4j.oauth.consumerSecret=? -Dtwitter4j.oauth.accessToken=? -Dtwitter4j.oauth.accessTokenSecret=?" \
  --class com.lucidworks.spark.SparkApp \
  ./target/spark-solr-3.1.1-shaded.jar \
  twitter-to-solr -zkHost localhost:9983 -collection socialdata
but it fails and the following message is shown:
INFO ContextHandler: Started o.e.j.s.ServletContextHandler@29182679{/metrics/json,null,AVAILABLE,@Spark}
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.spark.SparkContext.jobProgressListener()Lorg/apache/spark/ui/jobs/JobProgressListener;
I can confirm the path for ./target/spark-solr-3.1.1-shaded.jar is correct.
I suspect there is something wrong with --class com.lucidworks.spark.SparkApp (the classpath), but I am not sure.
I am running in local mode and I changed the parameters as instructed in the example.
Version:
Spark 2.1.1
Spark-solr 3.1.1
Solr 6.6.0

Spark Job Container exited with exitCode: -1000

I have been struggling to run a sample job with Spark 2.0.0 in yarn cluster mode; the job exits with exitCode: -1000 without any other clues. The same job runs properly in local mode.
Spark command:
spark-submit \
--conf "spark.yarn.stagingDir=/xyz/warehouse/spark" \
--queue xyz \
--class com.xyz.TestJob \
--master yarn \
--deploy-mode cluster \
--conf "spark.local.dir=/xyz/warehouse/tmp" \
/xyzpath/java-test-1.0-SNAPSHOT.jar $@
TestJob class:
import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class TestJob {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf();
        JavaSparkContext jsc = new JavaSparkContext(conf);
        // Count a small in-memory dataset to verify the context works.
        System.out.println(
            "Total count:" +
            jsc.parallelize(Arrays.asList(new Integer[]{1, 2, 3, 4})).count());
        jsc.stop();
    }
}
Error Log:
17/10/04 22:26:52 INFO Client: Application report for application_1506717704791_130756 (state: ACCEPTED)
17/10/04 22:26:52 INFO Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: root.xyz
start time: 1507181210893
final status: UNDEFINED
tracking URL: http://xyzserver:8088/proxy/application_1506717704791_130756/
user: xyz
17/10/04 22:26:53 INFO Client: Application report for application_1506717704791_130756 (state: ACCEPTED)
17/10/04 22:26:54 INFO Client: Application report for application_1506717704791_130756 (state: ACCEPTED)
17/10/04 22:26:55 INFO Client: Application report for application_1506717704791_130756 (state: ACCEPTED)
17/10/04 22:26:56 INFO Client: Application report for application_1506717704791_130756 (state: FAILED)
17/10/04 22:26:56 INFO Client:
client token: N/A
diagnostics: Application application_1506717704791_130756 failed 5 times due to AM Container for appattempt_1506717704791_130756_000005 exited with exitCode: -1000
For more detailed output, check application tracking page:http://xyzserver:8088/cluster/app/application_1506717704791_130756Then, click on links to logs of each attempt.
Diagnostics: Failing this attempt. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: root.xyz
start time: 1507181210893
final status: FAILED
tracking URL: http://xyzserver:8088/cluster/app/application_1506717704791_130756
user: xyz
17/10/04 22:26:56 INFO Client: Deleted staging directory /xyz/spark/.sparkStaging/application_1506717704791_130756
Exception in thread "main" org.apache.spark.SparkException: Application application_1506717704791_130756 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1167)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1213)
When I browse the page http://xyzserver:8088/cluster/app/application_1506717704791_130756 it doesn't exist.
No YARN application logs were found:
$yarn logs -applicationId application_1506717704791_130756
/apps/yarn/logs/xyz/logs/application_1506717704791_130756 does not have any log files.
What could possibly be the root cause of this error, and how can I get detailed error logs?
After spending nearly a whole day I found the root cause. When I remove spark.yarn.stagingDir it starts working, and I am still not sure why Spark was complaining about it.
Previous spark-submit:
spark-submit \
--conf "spark.yarn.stagingDir=/xyz/warehouse/spark" \
--queue xyz \
--class com.xyz.TestJob \
--master yarn \
--deploy-mode cluster \
--conf "spark.local.dir=/xyz/warehouse/tmp" \
/xyzpath/java-test-1.0-SNAPSHOT.jar $@
New spark-submit:
spark-submit \
--queue xyz \
--class com.xyz.TestJob \
--master yarn \
--deploy-mode cluster \
--conf "spark.local.dir=/xyz/warehouse/tmp" \
/xyzpath/java-test-1.0-SNAPSHOT.jar $@

Spark-yarn ends with an error exitCode=16, how to solve that?

I am using Apache Spark 2.0.0 and Apache Hadoop 2.6.0. I am trying to run my spark application on my hadoop cluster.
I used the following command line:
bin/spark-submit --class org.JavaWordCount \
--master yarn \
--deploy-mode cluster \
--driver-memory 512m \
--queue default \
/opt/JavaWordCount.jar \
10
However, YARN ends with an error exitCode=16:
17/01/25 11:05:49 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
17/01/25 11:05:49 INFO impl.ContainerManagementProtocolProxy: Opening proxy : hmaster:59600
17/01/25 11:05:49 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL TERM
17/01/25 11:05:49 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 16, (reason: Shutdown hook called before final status was reported.)
17/01/25 11:05:49 INFO storage.DiskBlockManager: Shutdown hook called
I tried to solve this issue with this topic, but it doesn't give a practical answer.
Does anyone know how to solve this issue?
Thanks in advance.
I just encountered this issue. Excess memory is being used by the JVM. Try adding the property below
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
in the yarn-site.xml of all NodeManagers, then restart them. It worked for me.
Refer: https://issues.apache.org/jira/browse/YARN-4714

Spark on Mesos Cluster - Task Fails

I'm trying to run a Spark application in a Mesos cluster where I have one master and one slave. The slave has 8GB RAM assigned for Mesos. The master is running the Spark Mesos Dispatcher.
I use the following command to submit a Spark application (which is a streaming application).
spark-submit --master mesos://mesos-master:7077 --class com.verifone.media.ums.scheduling.spark.SparkBootstrapper --deploy-mode cluster scheduling-spark-0.5.jar
And I see the following output, which shows it was submitted successfully:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/09/01 12:52:38 INFO RestSubmissionClient: Submitting a request to launch an application in mesos://mesos-master:7077.
15/09/01 12:52:39 INFO RestSubmissionClient: Submission successfully created as driver-20150901072239-0002. Polling submission state...
15/09/01 12:52:39 INFO RestSubmissionClient: Submitting a request for the status of submission driver-20150901072239-0002 in mesos://mesos-master:7077.
15/09/01 12:52:39 INFO RestSubmissionClient: State of driver driver-20150901072239-0002 is now QUEUED.
15/09/01 12:52:40 INFO RestSubmissionClient: Server responded with CreateSubmissionResponse:
{
"action" : "CreateSubmissionResponse",
"serverSparkVersion" : "1.4.1",
"submissionId" : "driver-20150901072239-0002",
"success" : true
}
However, this fails in Mesos, and when I look at the Spark Cluster UI, I see the following message.
task_id { value: "driver-20150901070957-0001" } state: TASK_FAILED message: "" slave_id { value: "20150831-082639-167881920-5050-4116-S6" } timestamp: 1.441091399975446E9 source: SOURCE_SLAVE reason: REASON_MEMORY_LIMIT 11: "\305-^E\377)N\327\277\361:\351\fm\215\312"
Seems like it is related to memory, but I'm not sure whether I have to configure something here to get this working.
UPDATE
I looked at the mesos logs in the slave, and I see the following message.
E0901 07:56:26.086618 1284 fetcher.cpp:515] Failed to run mesos-fetcher: Failed to fetch all URIs for container '33183181-e91b-4012-9e21-baa37485e755' with exit status: 256
So I thought that this could be because of the Spark Executor URL, so I modified the spark-submit to be as follows and increased memory for both driver and slave, but still I see the same error.
spark-submit \
--master mesos://mesos-master:7077 \
--class com.verifone.media.ums.scheduling.spark.SparkBootstrapper \
--deploy-mode cluster \
--driver-memory 1G \
--executor-memory 4G \
--conf spark.executor.uri=http://d3kbcqa49mib13.cloudfront.net/spark-1.4.1-bin-hadoop2.6.tgz \
scheduling-spark-0.5.jar
UPDATE 2
I got past this point by following @hartem's advice (see comments). Tasks are running now, but the actual Spark application still does not run in the cluster. When I look at the logs I see the following; after the last line, it seems that Spark does not proceed any further.
15/09/01 10:33:41 INFO SparkContext: Added JAR file:/tmp/mesos/slaves/20150831-082639-167881920-5050-4116-S8/frameworks/20150831-082639-167881920-5050-4116-0004/executors/driver-20150901103327-0002/runs/47339c12-fb78-43d6-bc8a-958dd94d0ccf/spark-1.4.1-bin-hadoop2.6/../scheduling-spark-0.5.jar at http://192.172.1.31:33666/jars/scheduling-spark-0.5.jar with timestamp 1441103621639
I0901 10:33:41.728466 4375 sched.cpp:157] Version: 0.23.0
I0901 10:33:41.730764 4383 sched.cpp:254] New master detected at master@192.172.1.10:7077
I0901 10:33:41.730908 4383 sched.cpp:264] No credentials provided. Attempting to register without authentication
I had a similar issue. The problem was that the slave could not find the required jar for running the class file (SparkPi). So I gave the HTTP URL of the jar and it worked; the jar needs to be placed somewhere accessible to the whole cluster, not on the local file system.
/home/centos/spark-1.6.1-bin-hadoop2.6/bin/spark-submit \
--name SparkPiTestApp \
--class org.apache.spark.examples.SparkPi \
--master mesos://xxxxxxx:7077 \
--deploy-mode cluster \
--executor-memory 5G --total-executor-cores 30 \
http://downloads.mesosphere.com.s3.amazonaws.com/assets/spark/spark-examples_2.10-1.4.0-SNAPSHOT.jar 100
Could you please do export GLOG_v=1 before launching the slave and see if there is anything interesting in the slave log? I would also look for stdout and stderr files under the slave working directory and see if they contain any clues.
