DSE spark-submit failing with SHUTDOWN_HOOK_PRIORITY even though hadoop2 is not on the classpath - apache-spark

I am trying to run the following Java driver program in my local Mac environment. I'm fairly sure I do not have hadoop2 on my classpath, so I'm not sure why it still fails with the SHUTDOWN_HOOK_PRIORITY error. Any insight would be of great help. Note that I can run a pyspark job with no exception.
I am running DSE 4.8.4 locally, and the following is the invocation:
$SPARKBINFOLDER/dse spark-submit --master local[2] --class com.sample.driver.SampleLoader SampleLoader.jar $#
The following is the code snippet I am using:
import java.io.Serializable;
import java.net.URL;
import java.net.URLClassLoader;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class SampleLoader implements Serializable {
    private transient SparkConf sconf;

    private SampleLoader(SparkConf sconf) {
        this.sconf = sconf;
    }

    private void run() {
        // Print the system classloader's classpath for debugging.
        ClassLoader cl = ClassLoader.getSystemClassLoader();
        URL[] urls = ((URLClassLoader) cl).getURLs();
        for (URL url : urls) {
            System.out.println(url.getFile());
        }

        // Fails here with ExceptionInInitializerError (SHUTDOWN_HOOK_PRIORITY).
        JavaSparkContext jsc = new JavaSparkContext(sconf);
        runSparkJob(jsc);
        jsc.stop();
    }

    private void runSparkJob(JavaSparkContext jsc) {
    }
}
The following is the classloader classpath, printed just before the failing line of code (JavaSparkContext jsc = new JavaSparkContext(sconf);):
########Printing the Classloader class path ........ /Users/xxxxxx/cassandra/dse484/resources/spark/conf/ /Users/xxxxxx/cassandra/dse484/lib/dse-core-4.8.4.jar /Users/xxxxxx/cassandra/dse484/lib/dse-hadoop-4.8.4.jar /Users/xxxxxx/cassandra/dse484/lib/dse-hive-4.8.4.jar /Users/xxxxxx/cassandra/dse484/lib/dse-search-4.8.4.jar /Users/xxxxxx/cassandra/dse484/lib/dse-spark-4.8.4.jar /Users/xxxxxx/cassandra/dse484/lib/dse-sqoop-4.8.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/conf/ /Users/xxxxxx/cassandra/dse484/resources/spark/lib/JavaEWAH-0.3.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/RoaringBitmap-0.4.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/ST4-4.0.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/activation-1.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/akka-actor_2.10-2.3.4-spark.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/akka-remote_2.10-2.3.4-spark.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/akka-slf4j_2.10-2.3.4-spark.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/ant-1.9.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/ant-launcher-1.9.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/antlr-2.7.7.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/antlr-runtime-3.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/arpack_combined_all-0.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/asm-3.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/asm-4.0.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/asm-commons-3.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/asm-tree-3.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/avro-1.7.7.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/avro-ipc-1.7.7.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/avro-mapred-1.7.7-hadoop1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/avro-mapred-1.7.7.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/bonecp-0.8.0.RELEASE.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/breeze-macros_2.10-0.11.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/breeze_2.10-0.11.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/chill-java-0.5.0.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/chill_2.10-0.5.0.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-cli-1.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-codec-1.9.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-collections-3.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-compress-1.4.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-httpclient-3.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-io-2.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-lang-2.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-lang3-3.3.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-logging-1.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-math3-3.4.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-net-2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/compress-lzf-1.0.3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/config-1.2.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/core-1.1.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/guava-16.0.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hamcrest-core-1.3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/httpclient-4.4.1.jar 
/Users/xxxxxx/cassandra/dse484/resources/spark/lib/httpcore-4.4.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/ivy-2.4.0.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jackson-annotations-2.3.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jackson-core-2.3.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jackson-core-asl-1.9.13.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jackson-databind-2.3.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jackson-mapper-asl-1.9.13.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jackson-module-scala_2.10-2.3.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jansi-1.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/javax.servlet-3.0.0.v201112011016.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/javolution-5.5.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jaxb-api-2.2.7.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jaxb-core-2.2.7.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jaxb-impl-2.2.7.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jdo-api-3.0.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jersey-core-1.9.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jersey-server-1.9.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jets3t-0.7.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jetty-all-7.6.0.v20120127.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jetty-continuation-8.1.14.v20131031.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jetty-http-8.1.14.v20131031.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jetty-io-8.1.14.v20131031.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jetty-security-8.1.14.v20131031.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jetty-server-8.1.14.v20131031.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jetty-servlet-8.1.14.v20131031.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jetty-util-6.1.26.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jetty-util-8.1.14.v20131031.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jline-0.9.94.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jline-2.10.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/joda-convert-1.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/joda-time-2.3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jodd-core-3.6.3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jpam-1.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/json-20090211.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/json4s-ast_2.10-3.2.10.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/json4s-core_2.10-3.2.10.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/json4s-jackson_2.10-3.2.10.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jsr166e-1.1.0.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jsr305-2.0.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jta-1.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jtransforms-2.4.0.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/junit-4.12.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/kryo-2.21.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/libfb303-0.9.3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/libthrift-0.9.3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/lz4-1.2.0.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/mail-1.4.1.jar 
/Users/xxxxxx/cassandra/dse484/resources/spark/lib/mesos-0.21.1-shaded-protobuf.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/metrics-core-3.1.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/metrics-graphite-3.1.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/metrics-json-3.1.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/metrics-jvm-3.1.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/minlog-1.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/objenesis-1.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/opencsv-2.3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/oro-2.0.8.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/paranamer-2.6.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/parquet-column-1.6.0rc3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/parquet-common-1.6.0rc3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/parquet-encoding-1.6.0rc3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/parquet-format-2.2.0-rc1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/parquet-generator-1.6.0rc3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/parquet-hadoop-1.6.0rc3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/parquet-hadoop-bundle-1.3.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/parquet-jackson-1.6.0rc3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/pmml-agent-1.1.15.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/pmml-model-1.1.15.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/pmml-schema-1.1.15.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/protobuf-java-2.5.0-spark.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/py4j-0.8.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/pyrolite-4.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/quasiquotes_2.10-2.0.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/reflectasm-1.07-shaded.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/scala-compiler-2.10.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/scala-library-2.10.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/scala-reflect-2.10.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/scalap-2.10.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/servlet-api-2.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/slf4j-api-1.7.12.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/snappy-0.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/snappy-java-1.0.5.3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-bagel_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-cassandra-connector-java_2.10-1.4.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-cassandra-connector_2.10-1.4.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-catalyst_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-core_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-graphx_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-hive_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-launcher_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-mllib_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-network-common_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-network-shuffle_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-repl_2.10-1.4.2.2.jar 
/Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-sql_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-streaming_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-unsafe_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spire-macros_2.10-0.7.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spire_2.10-0.7.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/stax-api-1.0.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/stream-2.7.0.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/stringtemplate-3.2.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/tachyon-0.6.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/tachyon-client-0.6.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/uncommons-maths-1.2.2a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/unused-1.0.0.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/velocity-1.7.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/xz-1.0.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/datanucleus-api-jdo-3.2.6.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/datanucleus-core-3.2.10.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/datanucleus-rdbms-3.2.9.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-0.13-metastore-cassandra-connector-0.2.11.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-0.13.1-cassandra-connector-0.2.11.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-ant-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-beeline-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-cli-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-common-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-exec-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-hwi-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-jdbc-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-metastore-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-serde-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-service-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-shims-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-shims-0.20-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-shims-0.20S-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-shims-0.23-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-shims-common-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-shims-common-secure-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-hive-thriftserver_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/HdrHistogram-1.2.1.1.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/antlr-2.7.7.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/antlr-3.2.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/antlr-runtime-3.2.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/aopalliance-1.0.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-asn1-api-1.0.0-M24.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-asn1-ber-1.0.0-M24.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-i18n-1.0.0-M24.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-ldap-client-api-1.0.0-M24.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-ldap-codec-core-1.0.0-M24.jar 
/Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-ldap-codec-standalone-1.0.0-M24.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-ldap-extras-codec-1.0.0-M24.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-ldap-extras-codec-api-1.0.0-M24.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-ldap-model-1.0.0-M24.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-ldap-net-mina-1.0.0-M24.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-util-1.0.0-M24.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/asm-5.0.3.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-beanutils-1.7.0.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-beanutils-core-1.8.0.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-codec-1.9.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-collections-3.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-compiler-2.6.1.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-configuration-1.6.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-digester-1.8.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-io-2.4.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-lang-2.6.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-logging-1.1.1.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-pool-1.6.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/guava-16.0.1.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/guice-3.0.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/guice-multibindings-3.0.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/jackson-annotations-2.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/jackson-core-2.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/jackson-databind-2.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/janino-2.6.1.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/java-uuid-generator-3.1.3.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/javassist-3.18.2-GA.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/javax.inject-1.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/jbcrypt-0.4d.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/jcl-over-slf4j-1.7.10.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/jline-1.0.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/journalio-1.4.2.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/jsr305-2.0.1.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/kmip-1.7.1e.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/log4j-1.2.13.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/mina-core-2.0.7.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/reflections-0.9.10.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/slf4j-api-1.7.10.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/stringtemplate-3.2.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/validation-api-1.1.0.Final.jar /Users/xxxxxx/cassandra/dse484/resources/dse/conf/ /Users/xxxxxx/cassandra/dse484/resources/hadoop/ /Users/xxxxxx/cassandra/dse484/resources/hadoop/conf/ /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/airline-0.6.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/ant-1.6.5.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/automaton-1.11-8.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-beanutils-1.7.0.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-beanutils-core-1.8.0.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-cli-1.2.jar 
/Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-codec-1.4.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-collections-3.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-configuration-1.6.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-digester-1.8.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-el-1.0.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-httpclient-3.0.1.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-lang-2.4.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-logging-1.1.1.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-math-2.1.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-net-1.4.1.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/core-3.1.1.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/elephant-bird-hadoop-compat-4.3.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/ftplet-api-1.0.0.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/ftpserver-core-1.0.0.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/ftpserver-deprecated-1.0.0-M2.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/hadoop-core-1.0.4.18.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/hadoop-examples-1.0.4.18.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/hadoop-fairscheduler-1.0.4.18.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/hadoop-streaming-1.0.4.18.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/hadoop-test-1.0.4.18.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/hadoop-tools-1.0.4.18.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/hsqldb-1.8.0.10.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/jackson-core-asl-1.8.8.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/jackson-mapper-asl-1.8.8.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/javax.inject-1.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/jets3t-0.7.1.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/jetty-6.1.26.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/jetty-util-6.1.26.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/jsp-2.1-6.1.14.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/jsp-api-2.1-6.1.14.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/kfs-0.3.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/mina-core-2.0.0-M5.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/netty-3.9.8.Final.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/oro-2.0.8.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/servlet-api-2.5-20081211.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/servlet-api-2.5-6.1.14.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/snappy-java-1.0.5.3.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/xmlenc-0.52.jar /Users/xxxxxx/cassandra/dse484/resources/driver/lib/cassandra-driver-core-2.1.7.1.jar /Users/xxxxxx/cassandra/dse484/resources/driver/lib/cassandra-driver-dse-2.1.7.1.jar /Users/xxxxxx/cassandra/dse484/resources/driver/lib/metrics-core-3.0.2.jar /Users/xxxxxx/cassandra/dse484/resources/driver/lib/slf4j-api-1.7.5.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/antlr-runtime-3.5.2.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/conf/ /Users/xxxxxx/cassandra/dse484/resources/cassandra/tools/lib/stress.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/ST4-4.0.8.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/antlr-3.5.2.jar 
/Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/cassandra-all-2.1.12.1046.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/cassandra-clientutil-2.1.12.1046.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/cassandra-thrift-2.1.12.1046.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/commons-cli-1.1.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/commons-codec-1.9.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/commons-lang-2.6.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/commons-lang3-3.1.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/commons-logging-1.2.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/commons-math3-3.2.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/compress-lzf-0.8.4.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/concurrentlinkedhashmap-lru-1.3.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/disruptor-3.0.1.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/elephant-bird-hadoop-compat-4.3.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/fastutil-6.5.7.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/guava-16.0.1.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/high-scale-lib-1.0.6.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/httpclient-4.4.1.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/httpcore-4.4.1.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/jackson-core-asl-1.9.2.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/jackson-mapper-asl-1.9.2.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/jamm-0.3.0.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/jbcrypt-0.4d.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/jna-4.0.0.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/joda-time-1.6.2.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/json-simple-1.1.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/libthrift-0.9.3.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/logback-classic-1.1.2.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/logback-core-1.1.2.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/lz4-1.2.0.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/metrics-core-2.2.0.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/netty-all-4.0.33.dse.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/reporter-config-2.1.0.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/slf4j-api-1.7.12.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/snakeyaml-1.12.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/snappy-java-1.0.5.3.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/stream-2.5.2.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/super-csv-2.1.0.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/thrift-server-0.3.7.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/datanucleus-api-jdo-3.2.6.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/datanucleus-core-3.2.10.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/datanucleus-rdbms-3.2.9.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/conf/
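To make that listing easier to scan, the same classloader output can be filtered down to Hadoop artifacts with a small standalone sketch (a hypothetical helper, not part of the driver above); note that hadoop-core-1.0.4.18.jar, a Hadoop 1.x jar, does appear in the list:
// Hypothetical sketch: narrow the system classloader listing to Hadoop jars.
import java.net.URL
import java.net.URLClassLoader

object ClasspathFilter {
  def main(args: Array[String]): Unit = {
    val urls: Seq[URL] = ClassLoader.getSystemClassLoader match {
      case u: URLClassLoader => u.getURLs.toSeq
      case _                 => Seq.empty
    }
    urls.map(_.getFile).filter(_.contains("hadoop")).foreach(println)
  }
}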
The following is the exception:
Exception in thread "main" java.lang.ExceptionInInitializerError
at org.apache.spark.util.Utils$.createTempDir(Utils.scala:225)
at org.apache.spark.util.Utils$$anonfun$getOrCreateLocalRootDirsImpl$2.apply(Utils.scala:653)
at org.apache.spark.api.java.JavaSparkContext.&lt;init&gt;(JavaSparkContext.scala:61)
at com.walmart.gis.spark.uber.ExtractCatalogItems.run(ExtractCatalogItems.java:60)
at com.walmart.gis.spark.uber.ExtractCatalogItems.main(ExtractCatalogItems.java:285)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
Caused by: java.lang.NoSuchFieldException: SHUTDOWN_HOOK_PRIORITY
at java.lang.Class.getField(Class.java:1584)
at org.apache.spark.util.SparkShutdownHookManager.install(ShutdownHookManager.scala:222)
at org.apache.spark.util.ShutdownHookManager$.shutdownHooks$lzycompute(ShutdownHookManager.scala:50)
at org.apache.spark.util.ShutdownHookManager$.shutdownHooks(ShutdownHookManager.scala:48)
at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:191)
at org.apache.spark.util.ShutdownHookManager$.<init>(ShutdownHookManager.scala:58)
at org.apache.spark.util.ShutdownHookManager$.<clinit>(ShutdownHookManager.scala)
... 32 more

I was able to work around this issue by setting scope to provided on the Spark dependency in the POM file:
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope> <!-- To fix the SHUTDOWN_HOOK_PRIORITY error, add this line -->
</dependency>
This forced the job to use the Spark library included with DSE rather than the one packaged in my JAR file.
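For context on why provided helps: the NoSuchFieldException above comes from Spark reflectively looking up FileSystem.SHUTDOWN_HOOK_PRIORITY, a field that exists on Hadoop 2.x's FileSystem but not on Hadoop 1.x's, so the failure depends on which Hadoop FileSystem class wins on the driver classpath. A small diagnostic sketch (assuming only that an org.apache.hadoop.fs.FileSystem class is loadable) that replays the same lookup and prints which jar the class came from:
// Diagnostic sketch: replays the reflective lookup Spark's shutdown-hook
// setup performs, and shows where FileSystem was actually loaded from.
object ShutdownHookProbe {
  def main(args: Array[String]): Unit = {
    val fsClass = Class.forName("org.apache.hadoop.fs.FileSystem")
    val source = Option(fsClass.getProtectionDomain.getCodeSource).map(_.getLocation)
    println(s"FileSystem loaded from: ${source.orNull}")
    try {
      fsClass.getField("SHUTDOWN_HOOK_PRIORITY")
      println("Hadoop 2.x-style FileSystem: field present")
    } catch {
      case _: NoSuchFieldException =>
        println("Hadoop 1.x-style FileSystem: field missing, same failure as above")
    }
  }
}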

Related

How to use dbutils in a SparkListener on Databricks

Using Azure Databricks Runtime 9.1, I want to start a SparkListener and access dbutils features inside of the SparkListener.
This listener should log some information on the start of the Spark application. It should list out the file system (as a simple example) using dbutils.fs.ls.
The question How to properly access dbutils in Scala when using Databricks Connect is very close to what I'm looking for, but it focuses on Databricks Connect, whereas I want dbutils inside a SparkListener. It does point to the dbutils API library on the MS Docs page, which seems to indicate that I need only specify the correct target and version of the dbutils-api package.
In the sample listener below:
If I do not include the import com.databricks.dbutils_v1.DBUtilsHolder.dbutils, the jar fails to compile, since I reference dbutils in the onApplicationStart method.
When I do include the import, it compiles successfully.
However, the SparkListener fails to initialize.
I receive a NullPointerException after it tries to execute the dbutils.fs.ls command.
Any thoughts and/or guidance would be greatly appreciated!
Sample Listener Using dbutils on Application Start
package my.custom.listener

import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationStart}
import org.slf4j.{Logger, LoggerFactory}

// Crucial Import
import com.databricks.dbutils_v1.DBUtilsHolder.dbutils

class LibraryListener extends SparkListener {
  private var isDatabricks = false
  val log: Logger = LoggerFactory.getLogger(classOf[LibraryListener])

  override def onApplicationStart(applicationStart: SparkListenerApplicationStart): Unit = {
    log.info("HELLO WORLD!")
    log.info(s"App Name ${applicationStart.appName}")
    log.info(s"User ${applicationStart.sparkUser}")
    isDatabricks = sys.env.contains("DATABRICKS_RUNTIME_VERSION")
    if (isDatabricks) {
      log.info("WE ARE USING DATABRICKS!")
      // Dummy example of using dbutils (mkString renders the Seq[FileInfo] as a String)
      log.info(dbutils.fs.ls("dbfs:/").mkString(", "))
    }
  }
}
Error Message From Spark Listener Initialization
org.apache.spark.SparkException: Exception when registering SparkListener
at org.apache.spark.SparkContext.setupAndStartListenerBus(SparkContext.scala:2829)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:701)
at com.databricks.backend.daemon.driver.DatabricksILoop$.$anonfun$initializeSharedDriverContext$1(DatabricksILoop.scala:347)
at com.databricks.backend.daemon.driver.ClassLoaders$.withContextClassLoader(ClassLoaders.scala:29)
at com.databricks.backend.daemon.driver.DatabricksILoop$.initializeSharedDriverContext(DatabricksILoop.scala:347)
at com.databricks.backend.daemon.driver.DatabricksILoop$.getOrCreateSharedDriverContext(DatabricksILoop.scala:277)
at com.databricks.backend.daemon.driver.DriverCorral.driverContext(DriverCorral.scala:229)
at com.databricks.backend.daemon.driver.DriverCorral.<init>(DriverCorral.scala:102)
at com.databricks.backend.daemon.driver.DriverDaemon.<init>(DriverDaemon.scala:50)
at com.databricks.backend.daemon.driver.DriverDaemon$.create(DriverDaemon.scala:287)
at com.databricks.backend.daemon.driver.DriverDaemon$.wrappedMain(DriverDaemon.scala:362)
at com.databricks.DatabricksMain.$anonfun$main$1(DatabricksMain.scala:117)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.DatabricksMain.$anonfun$withStartupProfilingData$1(DatabricksMain.scala:425)
at com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:395)
at com.databricks.logging.UsageLogging.executeThunkAndCaptureResultTags$1(UsageLogging.scala:484)
at com.databricks.logging.UsageLogging.$anonfun$recordOperationWithResultTags$4(UsageLogging.scala:504)
at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:266)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:261)
at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:258)
at com.databricks.DatabricksMain.withAttributionContext(DatabricksMain.scala:85)
at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:305)
at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:297)
at com.databricks.DatabricksMain.withAttributionTags(DatabricksMain.scala:85)
at com.databricks.logging.UsageLogging.recordOperationWithResultTags(UsageLogging.scala:479)
at com.databricks.logging.UsageLogging.recordOperationWithResultTags$(UsageLogging.scala:404)
at com.databricks.DatabricksMain.recordOperationWithResultTags(DatabricksMain.scala:85)
at com.databricks.logging.UsageLogging.recordOperation(UsageLogging.scala:395)
at com.databricks.logging.UsageLogging.recordOperation$(UsageLogging.scala:367)
at com.databricks.DatabricksMain.recordOperation(DatabricksMain.scala:85)
at com.databricks.DatabricksMain.withStartupProfilingData(DatabricksMain.scala:425)
at com.databricks.DatabricksMain.main(DatabricksMain.scala:116)
at com.databricks.backend.daemon.driver.DriverDaemon.main(DriverDaemon.scala)
Caused by: java.lang.NullPointerException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.databricks.dbutils_v1.DBUtilsHolder$$anon$1.invoke(DBUtilsHolder.scala:17)
at com.sun.proxy.$Proxy35.fs(Unknown Source)
at my.custom.listener.LibraryListener.<init>(LibraryListener.scala:19)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.util.Utils$.$anonfun$loadExtensions$1(Utils.scala:3077)
at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at scala.collection.TraversableLike.flatMap(TraversableLike.scala:245)
at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:108)
at org.apache.spark.util.Utils$.loadExtensions(Utils.scala:3066)
at org.apache.spark.SparkContext.$anonfun$setupAndStartListenerBus$1(SparkContext.scala:2810)
at org.apache.spark.SparkContext.$anonfun$setupAndStartListenerBus$1$adapted(SparkContext.scala:2809)
at scala.Option.foreach(Option.scala:407)
at org.apache.spark.SparkContext.setupAndStartListenerBus(SparkContext.scala:2809)
... 33 more
build.gradle
plugins {
    id 'scala'
    id 'java-library'
}

repositories {
    mavenCentral()
}

dependencies {
    // Use Scala 2.12 in our library project
    implementation 'org.scala-lang:scala-library:2.12.15'

    // Crucial Implementation
    // https://mvnrepository.com/artifact/com.databricks/dbutils-api
    implementation group: 'com.databricks', name: 'dbutils-api_2.12', version: '0.0.5'

    implementation group: 'org.slf4j', name: 'slf4j-api', version: '1.7.32'
    implementation group: 'org.apache.spark', name: 'spark-core_2.12', version: '3.0.0'
    implementation group: 'org.apache.spark', name: 'spark-sql_2.12', version: '3.0.0'
    implementation 'com.google.guava:guava:30.1.1-jre'

    testImplementation 'junit:junit:4.13.2'
    testImplementation 'org.scalatest:scalatest_2.12:3.2.9'
    testImplementation 'org.scalatestplus:junit-4-13_2.12:3.2.2.0'
    testImplementation group: 'org.slf4j', name: 'slf4j-simple', version: '1.7.32'
    testRuntimeOnly 'org.scala-lang.modules:scala-xml_2.12:1.2.0'

    api 'org.apache.commons:commons-math3:3.6.1'
}
Thank you for any insights!
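Update: a defensive variant I am experimenting with (a sketch, assuming the NullPointerException means DBUtilsHolder has not been populated yet when the listener runs; not a confirmed fix):
package my.custom.listener

import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationStart}
import org.slf4j.LoggerFactory
import com.databricks.dbutils_v1.DBUtilsHolder.dbutils
import scala.util.{Failure, Success, Try}

class SafeLibraryListener extends SparkListener {
  private val log = LoggerFactory.getLogger(classOf[SafeLibraryListener])

  override def onApplicationStart(start: SparkListenerApplicationStart): Unit = {
    // Sketch: dbutils is backed by a dynamic proxy that throws a
    // NullPointerException until the runtime injects the real implementation,
    // so the call is wrapped rather than assumed to succeed.
    Try(dbutils.fs.ls("dbfs:/")) match {
      case Success(files) => log.info(files.mkString(", "))
      case Failure(e)     => log.warn(s"dbutils not available yet: ${e.getMessage}")
    }
  }
}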

User class threw exception: java.io.FileNotFoundException: /home/username/config.properties (No such file or directory)

I am trying to run the spark-submit command below, reading the config file from the command-line args. When I run the code locally it runs fine, but when I run it with YARN, it fails with the error below.
Spark submit:
time spark-submit --files /etc/hive/conf/hive-site.xml --master yarn --deploy-mode cluster --class IntegrateAD /home/username/s3ReadWrite-assembly-1.1.jar "day0" "/home/username/config.properties"
Error:
INFO yarn.Client:
client token: Token { kind: YARN_CLIENT_TOKEN, service: }
diagnostics: User class threw exception: java.io.FileNotFoundException: /home/username/config.properties (No such file or directory)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.&lt;init&gt;(FileInputStream.java:138)
Code I am running:
val propFile = new File(args(1))
val properties: Properties = new Properties()
if (propFile.exists()) {  // a File instance is never null right after construction; check existence instead
  // val source = Source.fromFile(propFile)
  // properties.load(source.bufferedReader())
  properties.load(new FileInputStream(propFile))
  properties
}
else {
  logger.error("properties file cannot be loaded at path " + propFile)
  throw new FileNotFoundException("Properties file cannot be loaded")
}
Please help me understand what may be wrong here: is it my code, my spark-submit, or something else?
Thanks for your help in advance.
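Update: one thing I am trying (a hedged sketch, on the assumption that in yarn-cluster mode the driver runs on an arbitrary cluster node where the submit host's /home/username/config.properties does not exist): ship the file with --files and resolve the shipped copy by its basename.
import java.io.FileInputStream
import java.util.Properties

import org.apache.spark.SparkFiles

// Sketch: submit with
//   spark-submit --files /home/username/config.properties ... s3ReadWrite-assembly-1.1.jar "day0" "config.properties"
// and read the localized copy that YARN ships alongside the application.
val properties = new Properties()
val in = new FileInputStream(SparkFiles.get("config.properties"))
try properties.load(in) finally in.close()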

How do I use spark xml data source in .net?

Is there a way to use spark-xml (https://github.com/databricks/spark-xml) in a spark .net/c# job?
I was able to use the spark-xml data source from .NET.
Here is the test program:
using Microsoft.Spark.Sql;

namespace MySparkApp
{
    class Program
    {
        static void Main(string[] args)
        {
            SparkSession spark = SparkSession
                .Builder()
                .AppName("spark-xml-example")
                .GetOrCreate();

            // Read books.xml, treating each <book> element as a row.
            DataFrame df = spark.Read()
                .Option("rowTag", "book")
                .Format("xml")
                .Load("books.xml");
            df.Show();

            // Write selected columns back out as XML.
            df.Select("author", "_id")
                .Write()
                .Format("xml")
                .Option("rootTag", "books")
                .Option("rowTag", "book")
                .Save("newbooks.xml");

            spark.Stop();
        }
    }
}
Check out https://github.com/databricks/spark-xml and build an assembly jar with the 'sbt assembly' command, then copy the assembly jar to the dotnet project workspace.
Build the project: dotnet build
Submit the Spark job:
$SPARK_HOME/bin/spark-submit \
--class org.apache.spark.deploy.dotnet.DotnetRunner \
--jars scala-2.11/spark-xml-assembly-0.10.0.jar \
--master local bin/Debug/netcoreapp3.1/microsoft-spark-2.4.x-0.10.0.jar \
dotnet bin/Debug/netcoreapp3.1/sparkxml.dll

Error when using SparkJob with NamedRddSupport

The goal is to create the following on a local instance of Spark JobServer:
object foo extends SparkJob with NamedRddSupport
Question: How can I fix the following error which happens on every job:
{
  "status": "ERROR",
  "result": {
    "message": "Ask timed out on [Actor[akka://JobServer/user/context-supervisor/439b2467-spark.jobserver.genderPrediction#884262439]] after [10000 ms]",
    "errorClass": "akka.pattern.AskTimeoutException",
    "stack": ["akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:334)", "akka.actor.Scheduler$$anon$7.run(Scheduler.scala:117)", "scala.concurrent.Future$InternalCallbackExecutor$.scala$concurrent$Future$InternalCallbackExecutor$$unbatchedExecute(Future.scala:694)", "scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:691)", "akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(Scheduler.scala:467)", "akka.actor.LightArrayRevolverScheduler$$anon$8.executeBucket$1(Scheduler.scala:419)", "akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:423)", "akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375)", "java.lang.Thread.run(Thread.java:745)"]
  }
}
A more detailed error description by the Spark JobServer:
job-server[ERROR] Exception in thread "pool-100-thread-1" java.lang.AbstractMethodError: spark.jobserver.genderPrediction$.namedObjectsPrivate()Ljava/util/concurrent/atomic/AtomicReference;
job-server[ERROR] at spark.jobserver.JobManagerActor$$anonfun$spark$jobserver$JobManagerActor$$getJobFuture$4.apply(JobManagerActor.scala:248)
job-server[ERROR] at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
job-server[ERROR] at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
job-server[ERROR] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
job-server[ERROR] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
job-server[ERROR] at java.lang.Thread.run(Thread.java:745)
In case somebody wants to see the code:
package spark.jobserver

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import com.typesafe.config.{Config, ConfigFactory}
import collection.JavaConversions._
import scala.io.Source

object genderPrediction extends SparkJob with NamedRddSupport {

  // Main function
  def main(args: Array[String]) {
    val sc = new SparkContext()
    sc.hadoopConfiguration.set("fs.tachyon.impl", "tachyon.hadoop.TFS")
    val config = ConfigFactory.parseString("")
    val results = runJob(sc, config)
  }

  def validate(sc: SparkContext, config: Config): SparkJobValidation = SparkJobValid

  def runJob(sc: SparkContext, config: Config): Any = {
    "ok"
  }
}
Version information:
Spark is 1.5.0 - SparkJobServer is latest version
Thank you all very much in advance!
Adding more explanation to @noorul's answer:
It seems like you compiled the code with an old version of SJS and you are running it with the latest.
NamedObjects were recently added. You are getting AbstractMethodError because your server expects NamedObjects support and you didn't compile the code with that.
Also: you don't need the main method there since it won't be executed by SJS.
Ensure that your compile-time and runtime versions of the dependent libraries are the same, as in the sbt sketch below.
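For example, a build.sbt sketch (the spark.jobserver group, job-server-api artifact, and 0.6.x version here are assumptions; pin them to whatever your running server actually is):
// build.sbt sketch: compile against the same job-server API version that the
// deployed server runs, and mark it provided so it is not bundled into the jar.
libraryDependencies += "spark.jobserver" %% "job-server-api" % "0.6.2" % "provided"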

Error in simple spark application

I'm running a simple Spark application which does 'word to vector'. Here is my code (it is from the Spark website):
import org.apache.spark._
import org.apache.spark.rdd._
import org.apache.spark.SparkContext._
import org.apache.spark.mllib.feature.{Word2Vec, Word2VecModel}

object SimpleApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Word2Vector")
    val sc = new SparkContext(conf)
    val input = sc.textFile("text8").map(line => line.split(" ").toSeq)
    val word2vec = new Word2Vec()
    val model = word2vec.fit(input)
    val synonyms = model.findSynonyms("china", 40)
    for ((synonym, cosineSimilarity) <- synonyms) {
      println(s"$synonym $cosineSimilarity")
    }
    // Save and load model
    model.save(sc, "myModelPath")
  }
}
When running it, I get the following error message:
Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://GXYDEVVM:8020/user/hadoop/YOUR_SPARK_HOME/README.md
at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:285)
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1781)
at org.apache.spark.rdd.RDD.count(RDD.scala:1099)
at org.apache.spark.api.java.JavaRDDLike$class.count(JavaRDDLike.scala:442)
at org.apache.spark.api.java.AbstractJavaRDDLike.count(JavaRDDLike.scala:47)
at SimpleApp.main(SimpleApp.java:13)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:665)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:170)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
What is the problem? Where is this address /user/hadoop/YOUR_SPARK_HOME/README.md coming from?
This is probably related to your default Spark configuration.
Take a look (or use grep) in the conf directory of your Spark home directory. You should find a spark-env.sh file, which could contain a reference to the strange file.
In fact, Spark is trying to load the file from HDFS (more or less the standard if you run Spark on a cluster: your input/output should be reachable by the master and the worker nodes). If you use Spark locally, you have to configure the SparkContext using the setMaster method. Here is my version:
object SparkDemo {

  def log[A](key: String)(job: => A): A = {
    val start = System.currentTimeMillis
    val output = job
    println("===> %s in %s seconds"
      .format(key, (System.currentTimeMillis - start) / 1000.0))
    output
  }

  def main(args: Array[String]): Unit = {
    val modelName = "w2vModel"

    val sc = new SparkContext(
      new SparkConf()
        .setAppName("SparkDemo")
        .set("spark.executor.memory", "4G")
        .set("spark.driver.maxResultSize", "16G")
        .setMaster("spark://192.168.1.53:7077") // ip of the spark master
        // .setMaster("local[2]") // does not work... workers lose contact with the master after 120s
    )

    // take a look into the target folder if you are unsure how the jar is named
    // one-liner to compile / run: sbt package && sbt run
    sc.addJar("./target/scala-2.10/sparkling_2.10-0.1.jar")

    val input = sc.textFile("./text8").map(line => line.split(" ").toSeq)

    val word2vec = new Word2Vec()
    val model = log("compute model") { word2vec.fit(input) }
    log("save model") { model.save(sc, modelName) }

    val synonyms = model.findSynonyms("china", 40)
    for ((synonym, cosineSimilarity) <- synonyms) {
      println(s"$synonym $cosineSimilarity")
    }

    val model2 = log("reload model") { Word2VecModel.load(sc, modelName) }
  }
}
