How to troubleshoot package loading errors in Spark

I'm using Spark in HDInsight with a Jupyter notebook. I'm using the %%configure "magic" to import packages. Every time there is a problem with the package, Spark crashes with the error:
The code failed because of a fatal error: Status 'shutting_down' not
supported by session..
or
The code failed because of a fatal error: Session 28 unexpectedly
reached final status 'dead'. See logs:
Usually the problem was that I had mistyped the package name, so after a few attempts I could fix it. Now I'm trying to import spark-streaming-eventhubs_2.11 and I think I got the name right, but I still get the error. I looked at all kinds of logs but couldn't find one that shows any relevant info. Any idea how to troubleshoot errors like this?
%%configure -f
{ "conf": {"spark.jars.packages": "com.microsoft.azure:spark-streaming-eventhubs_2.11:2.0.5" }}
Additional info: when I run
spark-shell --conf spark.jars.packages=com.microsoft.azure:spark-streaming-eventhubs_2.11:2.0.5
The shell starts fine, and downloads the package

I was finally able to find the log files which contain the error. There are two log files which could be interesting:
Livy log: livy-livy-server.out
Yarn log
On my HDInsight cluster, I found the Livy log by connecting to one of the head nodes with SSH and downloading a file at this path (this log didn't contain useful info):
/var/log/livy/livy-livy-server.out
The actual error was in the yarn log file accessible from YarnUI. In HDInsight Azure Portal, go to "Cluster dashboard" -> "Yarn", find your session (KILLED status), click on "Logs" in the table, find "Log Type: stderr", click "click here for full log".
The problem in my case was a Scala version incompatibility between one of the dependencies of spark-streaming_2.11 and Livy. This is supposed to be fixed in Livy 0.4. More info here
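When the failure turns out to be a Scala version mismatch like this, one quick sanity check is to compare the Scala version your cluster's Spark was built against with the suffix of the package coordinate (the _2.11 part). A minimal sketch you can run in a working session or spark-shell:
// Print the Scala and Spark versions of the running session; the package
// suffix (_2.11 here) must match the Scala major.minor version shown.
println(scala.util.Properties.versionString)  // e.g. "version 2.11.8"
println(org.apache.spark.SPARK_VERSION)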

Related

Running Spark2.3 on Kubernetes with remote dependency on S3

I am running spark-submit to run on Kubernetes (Spark 2.3). My problem is that the InitContainer does not download my jar file if it's specified as an s3a:// path, but it does work if I put my jar on an HTTP server and use http://. The Spark driver fails, of course, because it can't find my class (and the jar file is in fact not in the image).
I have tried two approaches:
specifying the s3a path to the jar as the application jar argument to spark-submit, and
using --jars to specify the jar file's location on s3a, but both fail in the same way.
edit: also, using local:///home/myuser/app.jar does not work with the same symptoms.
On a failed run (dependency on s3a), I logged into the container and found the directory /var/spark-data/spark-jars/ to be empty. The init-container logs don't indicate any type of error.
Questions:
What is the correct way to specify remote dependencies on S3A?
Is S3A not supported yet? Only http(s)?
Any suggestions on how to further debug the InitContainer to determine why the download doesn't happen?
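One way to separate "the s3a path or credentials are wrong" from "the InitContainer does not handle s3a" is to try resolving the same path directly with the Hadoop FileSystem API from a Spark shell on the cluster. This is only a sketch; the bucket and key are placeholders, and it assumes hadoop-aws and a matching AWS SDK are on the classpath:
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Placeholder location; use the same s3a path you pass to spark-submit.
val jarPath = new Path("s3a://my-bucket/jars/app.jar")

val conf = new Configuration()
// Credentials normally come from fs.s3a.access.key / fs.s3a.secret.key or the
// default AWS provider chain; copying them from the environment here is just for the test.
sys.env.get("AWS_ACCESS_KEY_ID").foreach(conf.set("fs.s3a.access.key", _))
sys.env.get("AWS_SECRET_ACCESS_KEY").foreach(conf.set("fs.s3a.secret.key", _))

val fs = FileSystem.get(jarPath.toUri, conf)
println(s"exists: ${fs.exists(jarPath)}")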

Spark Job fails connecting to oracle in first attempt

We are running a Spark job which connects to Oracle and fetches some data. Attempt 0 or 1 of the JDBCRDD task always fails with the error below; in a subsequent attempt the task completes. As suggested on a few portals we even tried the -Djava.security.egd=file:///dev/urandom Java option, but it didn't solve the problem. Can someone please help us fix this issue?
java.sql.SQLRecoverableException: IO Error: Connection reset by peer, Authentication lapse 59937 ms.
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:794)
at oracle.jdbc.driver.PhysicalConnection.connect(PhysicalConnection.java:688)
The issue was with java.security.egd only. Setting it through the command line, i.e. -Djava.security.egd=file:///dev/urandom, was not working, so I set it through System.setProperty within the job. After that, the job no longer throws SQLRecoverableException.
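For reference, the in-job approach described above could look roughly like the sketch below. The JDBC URL and credentials are placeholders, and if the connections are opened on the executors (as with JdbcRDD) you may also need to pass the same -D flag via spark.executor.extraJavaOptions:
// Set the entropy source before any Oracle JDBC connection is created in this JVM.
System.setProperty("java.security.egd", "file:///dev/urandom")

// Placeholder connection details, shown only to illustrate the ordering.
val connection = java.sql.DriverManager.getConnection(
  "jdbc:oracle:thin:@//dbhost:1521/service", "user", "password")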
This exception has nothing to do with Apache Spark. "SQLRecoverableException: IO Error:" is simply the Oracle JDBC driver reporting that its connection
to the DBMS was closed out from under it while in use. The real problem is at
the DBMS, for example the session dying abruptly. Please check the DBMS
error log and share it with the question.
You can find a similar problem here:
https://access.redhat.com/solutions/28436
The fastest way is to export the SPARK_SUBMIT_OPTS environment variable before running your job, like this:
export SPARK_SUBMIT_OPTS=-Djava.security.egd=file:dev/urandom
I'm using Docker, so for me the full command is:
docker exec -it spark-master
bash -c "export SPARK_SUBMIT_OPTS=-Djava.security.egd=file:dev/urandom &&
/spark/bin/spark-submit --verbose --master spark://172.16.9.213:7077 /scala/sparkjob/target/scala-2.11/sparkjob-assembly-0.1.jar"
This exports the variable and then submits the job.

Spark execution gives "Invalid Log directory error"

I am a newbie to Spark and was trying to execute my jar on the cluster. When I run the jar, the job fails with the error: Error: invalid log directory /usr/share/spark-2.1.1/work//7/
Another thread suggested removing the flag SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true", so I tried that. I also changed the permissions of the log directory on all the workers.
Nothing helped. Can someone help me understand why it fails?

How to use external package in Jupyter of Azure Spark

I am trying to add an external package in Jupyter on Azure Spark.
%%configure -f
{ "packages" : [ "com.microsoft.azure:spark-streaming-eventhubs_2.11:2.0.4" ] }
Its output:
Current session configs: {u'kind': 'spark', u'packages': [u'com.microsoft.azure:spark-streaming-eventhubs_2.11:2.0.4']}
But when I tried to import:
import org.apache.spark.streaming.eventhubs.EventHubsUtils
I got an error:
The code failed because of a fatal error: Invalid status code '400'
from
http://an0-o365au.zdziktedd3sexguo45qd4z4qhg.xx.internal.cloudapp.net:8998/sessions
with error payload: "Unrecognized field \"packages\" (class
com.cloudera.livy.server.interactive.CreateInteractiveRequest), not
marked as ignorable (15 known properties: \"executorCores\", \"conf\",
\"driverMemory\", \"name\", \"driverCores\", \"pyFiles\",
\"archives\", \"queue\", \"kind\", \"executorMemory\", \"files\",
\"jars\", \"proxyUser\", \"numExecutors\",
\"heartbeatTimeoutInSecond\" [truncated]])\n at [Source:
HttpInputOverHTTP#5bea54d; line: 1, column: 32] (through reference
chain:
com.cloudera.livy.server.interactive.CreateInteractiveRequest[\"packages\"])".
Some things to try: a) Make sure Spark has enough available resources
for Jupyter to create a Spark context. For instructions on how to
assign resources see http://go.microsoft.com/fwlink/?LinkId=717038 b)
Contact your cluster administrator to make sure the Spark magics
library is configured correctly.
I also tried:
%%configure
{ "conf": {"spark.jars.packages": "com.microsoft.azure:spark-streaming-eventhubs_2.11:2.0.4" }}
Got the same error.
Could someone point me to the correct way to use an external package in Jupyter on Azure Spark?
If you're using HDInsight 3.6, then use the following. Also, be sure to restart your kernel before executing this:
%%configure -f
{"conf":{"spark.jars.packages":"com.microsoft.azure:spark-streaming-eventhubs_2.11:2.0.4"}}
Also, ensure that your package name, version and scala version are correct. Specifically, the JAR that you're trying to use has changed names since the posting of this question. More information on what it is called now can be found here: https://github.com/Azure/azure-event-hubs-spark.
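Once the session starts with that configuration, a quick way to confirm that the package actually resolved is to import something from it and print which jar it was loaded from, for example:
import org.apache.spark.streaming.eventhubs.EventHubsUtils
// Prints the jar the object was loaded from; if the import itself fails,
// the coordinate or Scala suffix in spark.jars.packages is likely wrong.
println(EventHubsUtils.getClass.getProtectionDomain.getCodeSource.getLocation)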

Error while running Zeppelin paragraphs in Spark on a Linux cluster in Azure HDInsight

I have been following this tutorial in order to set up Zeppelin on a Spark cluster (version 1.5.2) in HDInsight, on Linux. Everything worked fine; I managed to successfully connect to the Zeppelin notebook through the SSH tunnel. However, when I try to run any kind of paragraph, the first time I get the following error:
java.io.IOException: No FileSystem for scheme: wasb
After getting this error, if I try to rerun the paragraph, I get another error:
java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
These errors occur regardless of the code I enter, even if there is no reference to HDFS. In other words, I get the "No FileSystem" error even for a trivial Scala expression, such as parallelize.
Is there a missing configuration step?
I am downloading the tarball from the script you pointed to as I type, but my guess is that your Zeppelin and Spark installs are not set up to work with wasb. In order to get Spark to work with wasb, you need to add some jars to the classpath. To do this, add something like the following to your spark-defaults.conf (the paths might be different in HDInsight; this is from HDP on IaaS):
spark.driver.extraClassPath /usr/hdp/2.3.0.0-2557/hadoop/lib/azure-storage-2.2.0.jar:/usr/hdp/2.3.0.0-2557/hadoop/lib/microsoft-windowsazure-storage-sdk-0.6.0.jar:/usr/hdp/2.3.0.0-2557/hadoop/hadoop-azure-2.7.1.2.3.0.0-2557.jar
spark.executor.extraClassPath /usr/hdp/2.3.0.0-2557/hadoop/lib/azure-storage-2.2.0.jar:/usr/hdp/2.3.0.0-2557/hadoop/lib/microsoft-windowsazure-storage-sdk-0.6.0.jar:/usr/hdp/2.3.0.0-2557/hadoop/hadoop-azure-2.7.1.2.3.0.0-2557.jar
Once you have Spark working with wasb, the next step is to get those same jars onto the Zeppelin classpath. A good way to test your setup is to make a notebook that prints your environment variables and classpath:
sys.env.foreach(println(_))
val cl = ClassLoader.getSystemClassLoader
cl.asInstanceOf[java.net.URLClassLoader].getURLs.foreach(println)
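Once the Azure storage jars show up on that classpath, a quick wasb sanity check from a Zeppelin paragraph could look like this (the container and account names are placeholders):
import org.apache.hadoop.fs.FileSystem
// Should print org.apache.hadoop.fs.azure.NativeAzureFileSystem once the
// hadoop-azure and azure-storage jars are on the classpath.
val fs = FileSystem.get(
  new java.net.URI("wasb://mycontainer@myaccount.blob.core.windows.net/"),
  sc.hadoopConfiguration)
println(fs.getClass.getName)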
Also, looking at the install script, it tries to pull the Zeppelin jar from wasb; you might want to change that config to point somewhere else while you try some of these changes out (zeppelin.sh):
export SPARK_YARN_JAR=wasb:///apps/zeppelin/zeppelin-spark-0.5.5-SNAPSHOT.jar
I hope this helps. If you still have problems, I have some other ideas, but I would start with these first.
