I uploaded a jar library to my cluster in Databricks following this tutorial, but I have been unable to import the library or call its methods from a Databricks notebook. I haven't found any forums or documentation that address this topic, so I'm not sure it's even possible at this point.
I am able to run the jar as a job in Databricks; I just haven't been able to import the library into the notebook and run it from there.
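For reference, this is what I expected to work from a Scala notebook cell once the jar is attached to the cluster (com.example.MyUtils and greet are placeholders for a class and method that my jar actually contains):
// Scala notebook cell: classes from a cluster-installed jar should be on the classpath,
// so a normal import should work (com.example.MyUtils is a placeholder name)
import com.example.MyUtils

val greeting = MyUtils.greet("Databricks")   // greet is a placeholder method name
println(greeting)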
I also tried running the jar file using the %sh magic command but received the following JNI error:
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.NoClassDefFoundError: scala/Function0
This error is generally caused by a Scala version mismatch: scala/Function0 comes from the Scala standard library, so the jar was built against a Scala version that isn't on the runtime classpath. I would recommend aligning (or upgrading) the Scala version and then trying to import the custom library again.
Refer to: Apache Spark Exception in thread "main" java.lang.NoClassDefFoundError: scala/collection/GenTraversableOnce$class
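A quick way to check is to print the Scala version the cluster is actually running and compare it with the Scala suffix (_2.11, _2.12, ...) of the jar you built. A minimal sketch for a notebook cell:
// print the Scala version of the running cluster (scala.util.Properties is in the standard library)
println(scala.util.Properties.versionString)   // e.g. "version 2.12.10"
// the jar must be built against the same major Scala version as this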
I'm a newbie at this, so please be a little patient with me.
I'm running a Spark job to write some data into HBase, and I get this error:
2022-06-22 12:45:22:901 | ERROR | Caused by: java.lang.IllegalAccessError:
class org.apache.hadoop.hdfs.web.HftpFileSystem
cannot access its superinterface
org.apache.hadoop.hdfs.web.TokenAspect$TokenManagementDelegator
I read Error through remote Spark Job: java.lang.IllegalAccessError: class org.apache.hadoop.hdfs.web.HftpFileSystem, and since I'm using Gradle instead of Maven, I tried to exclude the dependency that brings in org.apache.hadoop.hdfs.web.HftpFileSystem like this...
dependencies {
    // keep Spark at compile-only scope, but drop its transitive hadoop-client dependency
    compileOnly("org.apache.spark:spark-core_$scala_major:$spark_version") {
        exclude group: "org.apache.hadoop", module: "hadoop-client"
    }
}
Compilation works fine, but execution fails in exactly the same way.
These are my versions:
spark_version = 2.4.7
hadoop_version = 3.1.1
Everything I've read is about conflicts between Spark and Hadoop, so:
How can I fix this? All I can think of is excluding the conflicting classes from the spark-core dependency and adding the right version of the Hadoop dependency.
Where can I find a reference for which versions are compatible (so I can set the right version of the Hadoop library)?
Can this be solved by having the infra team change something on the cluster?
I am not sure if I understood the issue correctly.
Thanks.
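For reference, this is how I'm checking which versions the cluster is actually running, rather than relying on the build script (a small spark-shell snippet; spark is the usual SparkSession provided by the shell):
// print the Spark and Hadoop versions the cluster is really running,
// so the Gradle dependencies can be matched against them
println(s"Spark:  ${spark.version}")
println(s"Hadoop: ${org.apache.hadoop.util.VersionInfo.getVersion}")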
I am new to Apache Spark and recently installed it, but I got an error:
Error: Could not find or load main class C:\spark\jars\aircompressor-0.8.jar
I checked that the file is present there, and I set up the environment variables and everything else necessary to run Spark successfully.
I ran into this and solved it by updating my Java installation from https://www.java.com/en/download/.
I upgraded to Java 8, Update 201.
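If it helps, after updating you can confirm which JVM Spark actually picks up once spark-shell starts (a minimal sketch using standard JVM system properties):
// run inside spark-shell: show the Java runtime Spark is actually using
println(System.getProperty("java.version"))   // e.g. "1.8.0_201"
println(System.getProperty("java.home"))      // path of the JRE/JDK in use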
I'm trying to read from Elasticsearch in a Livy job using the elasticsearch-spark jar. When I upload the jar to a Livy client (like the example here), I get this error and I'm not sure how to interpret it:
Caused by: java.lang.RuntimeException: java.lang.Error: Multiple ES-Hadoop
versions detected in the classpath; please use only one
jar:file:/tmp/tmp7d6epaqu/__livy__/elasticsearch-spark-20_2.11-6.2.2.jar
jar:file:/tmp/rsc-tmp3492512103399411501/__livy__/elasticsearch-spark-20_2.11-6.2.2.jar
I'm not sure what the temp directories are or why it's detecting two jars when I'm only importing one (if I remove the dependency from my pom, it complains about JavaEsSpark not existing). What am I doing wrong, and what do I need to do to fix this?
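One diagnostic that might help is asking the JVM which jar an ES-Hadoop class was actually loaded from, to see which of the two paths wins. A minimal sketch (org.elasticsearch.spark.rdd.EsSpark is just a class I'd expect the connector jar to contain; substitute whatever class your job uses):
// inside the Livy/Spark job: show which jar the ES-Hadoop classes come from
val esClass = Class.forName("org.elasticsearch.spark.rdd.EsSpark")
println(esClass.getProtectionDomain.getCodeSource.getLocation)
// whichever of the two tmp paths is printed, the other copy is the duplicate to drop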
I have installed Zeppelin 0.6.0 on my cluster, which has Spark 1.4.1 (HDP 2.3). As per the release notes, it supports Spark 1.6, but I'm not sure whether it is backward compatible.
When I try to run sc.version in the notebook, I can see that the Spark job is submitted to YARN, but it fails right away with the following error in the application log: Error: Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher
My SPARK_HOME path is correct, so I'm zeroing in on an incompatibility issue. These are my settings:
export MASTER=yarn-client
export SPARK_YARN_JAR=/usr/hdp/current/spark-client/lib/spark-assembly-1.4.1.2.3.2.0-2950-hadoop2.7.1.2.3.2.0-2950.jar
export SPARK_HOME=/usr/hdp/current/spark-client
Finally found a solution for this. Zeppelin 0.6 works with Spark 1.4.1; this was happening because of the interpreter configuration. The zeppelin-zeppelin-servername.log and .out files were helpful in resolving it. I had added a highcharts artifact to the Spark interpreter, and Zeppelin was not able to find that jar file. After providing the correct path, I was able to run highcharts and resolve this issue as well.
I made a JavaFX app and packaged it as a .exe. When I installed it on another PC and tried to run it, I got a screen with: Error invoking method
After that I get a screen with:
Failed to launch JVM
I added my libraries to the compile & run sections, but there is probably still something wrong with them (not being packaged, or something like that).
When I try to run the jar I get:
Caused by: java.lang.ClassNotFoundException: javax.persistence.Persistence
Make sure you bundled the jar containing that class. Looking at this similar thread, you can get the jar from here: http://ebr.springsource.com/repository/app/bundle/version/detail?name=com.springsource.javax.persistence&version=2.0.0