Error when using SparkJob with NamedRddSupport - apache-spark

The goal is to create the following on a local instance of Spark JobServer:
object foo extends SparkJob with NamedRddSupport
Question: how can I fix the following error, which happens on every job?
{
"status": "ERROR",
"result": {
"message": "Ask timed out on [Actor[akka://JobServer/user/context-supervisor/439b2467-spark.jobserver.genderPrediction#884262439]] after [10000 ms]",
"errorClass": "akka.pattern.AskTimeoutException",
"stack: ["akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:334)", "akka.actor.Scheduler$$anon$7.run(Scheduler.scala:117)", "scala.concurrent.Future$InternalCallbackExecutor$.scala$concurrent$Future$InternalCallbackExecutor$$unbatchedExecute(Future.scala:694)", "scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:691)", "akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(Scheduler.scala:467)", "akka.actor.LightArrayRevolverScheduler$$anon$8.executeBucket$1(Scheduler.scala:419)", "akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:423)", "akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375)", "java.lang.Thread.run(Thread.java:745)"]
}
}
A more detailed error description by the Spark JobServer:
job-server[ERROR] Exception in thread "pool-100-thread-1" java.lang.AbstractMethodError: spark.jobserver.genderPrediction$.namedObjectsPrivate()Ljava/util/concurrent/atomic/AtomicReference;
job-server[ERROR] at spark.jobserver.JobManagerActor$$anonfun$spark$jobserver$JobManagerActor$$getJobFuture$4.apply(JobManagerActor.scala:248)
job-server[ERROR] at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
job-server[ERROR] at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
job-server[ERROR] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
job-server[ERROR] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
job-server[ERROR] at java.lang.Thread.run(Thread.java:745)
In case somebody wants to see the code:
package spark.jobserver

import org.apache.spark.SparkContext._
import org.apache.spark.SparkContext
import com.typesafe.config.{Config, ConfigFactory}
import collection.JavaConversions._
import scala.io.Source

object genderPrediction extends SparkJob with NamedRddSupport {

  // Main function
  def main(args: scala.Array[String]) {
    val sc = new SparkContext()
    sc.hadoopConfiguration.set("fs.tachyon.impl", "tachyon.hadoop.TFS")
    val config = ConfigFactory.parseString("")
    val results = runJob(sc, config)
  }

  def validate(sc: SparkContext, config: Config): SparkJobValidation = SparkJobValid

  def runJob(sc: SparkContext, config: Config): Any = {
    "ok"
  }
}
Version information:
Spark is 1.5.0; SparkJobServer is the latest version.
Thank you all very much in advance!

Adding more explanation to noorul's answer:
It seems like you compiled the code against an old version of SJS but you are running it against the latest.
NamedObjects were added recently. You are getting an AbstractMethodError because your server expects NamedObjects support and you didn't compile the code with it.
Also: you don't need the main method there, since it won't be executed by SJS - see the trimmed version below.
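Once the job is rebuilt against the job-server-api version that the server actually runs, the job itself only needs validate and runJob; a trimmed-down version of the code above would look like this (same behaviour, main removed):

package spark.jobserver

import com.typesafe.config.Config
import org.apache.spark.SparkContext

object genderPrediction extends SparkJob with NamedRddSupport {

  // SJS calls validate and then runJob; no main method is needed
  def validate(sc: SparkContext, config: Config): SparkJobValidation = SparkJobValid

  def runJob(sc: SparkContext, config: Config): Any = "ok"
}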

Ensure that your compile-time and runtime versions of the dependent packages are the same.
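In sbt terms that means pinning the job-server-api dependency to the exact version of the running server, roughly like this (the coordinates, resolver, and version number below are illustrative assumptions, not taken from the question):

// build.sbt -- sketch only; use the version your SparkJobServer deployment actually runs
resolvers += "Job Server Bintray" at "https://dl.bintray.com/spark-jobserver/maven"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"     % "1.5.0" % "provided",
  "spark.jobserver"  %% "job-server-api" % "0.6.0" % "provided"  // assumed version -- must match the server
)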

Related

How to use dbutils in a SparkListener on Databricks

Using Azure Databricks Runtime 9.1, I want to start a SparkListener and access dbutils features inside of the SparkListener.
This listener should log some information on the start of the Spark application. It should list out the file system (as a simple example) using dbutils.fs.ls.
The question "How to properly access dbutils in Scala when using Databricks Connect" is very close to what I'm looking to do, but it is focused on Databricks Connect, whereas I want dbutils in a SparkListener. It does point to the dbutils-api library on the MS Docs page, which seems to indicate that I need only specify the correct target and version of the dbutils-api package.
In the sample listener below...
If I do not include the import com.databricks.dbutils_v1.DBUtilsHolder.dbutils, the jar fails to compile, since I reference dbutils in the onApplicationStart method.
When I do include the import, it successfully compiles.
However, it fails to initialize the SparkListener.
I receive a NullPointerException after it tries to execute the dbutils.fs.ls command.
Any thoughts and/or guidance would be greatly appreciated!
Sample Listener Using dbutils on Application Start
package my.custom.listener

import java.util.logging.Logger
import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationStart}
import org.slf4j.{Logger, LoggerFactory}
// Crucial Import
import com.databricks.dbutils_v1.DBUtilsHolder.dbutils

class LibraryListener extends SparkListener {

  private var isDatabricks = false

  val log = LoggerFactory.getLogger(classOf[LibraryListener])

  override def onApplicationStart(applicationStart: SparkListenerApplicationStart): Unit = {
    log.info("HELLO WORLD!")
    log.info(s"App Name ${applicationStart.appName}")
    log.info(s"User ${applicationStart.sparkUser}")
    isDatabricks = !(sys.env.get("DATABRICKS_RUNTIME_VERSION").isEmpty)
    if (isDatabricks) {
      log.info("WE ARE USING DATABRICKS!")
      // Dummy example of using dbutils
      log.info(dbutils.fs.ls("dbfs:/"))
    }
  }
}
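For context, a listener like this is normally registered through Spark's spark.extraListeners setting, which is why its constructor runs while the SparkContext is still starting up. Exactly how it is registered on the cluster is an assumption here, since the question does not say; something like:

# Assumed wiring (not shown in the question): cluster Spark config / spark-defaults.conf,
# so Spark instantiates the listener reflectively during SparkContext startup.
spark.extraListeners my.custom.listener.LibraryListener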
Error Message From Spark Listener Initialization
org.apache.spark.SparkException: Exception when registering SparkListener
at org.apache.spark.SparkContext.setupAndStartListenerBus(SparkContext.scala:2829)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:701)
at com.databricks.backend.daemon.driver.DatabricksILoop$.$anonfun$initializeSharedDriverContext$1(DatabricksILoop.scala:347)
at com.databricks.backend.daemon.driver.ClassLoaders$.withContextClassLoader(ClassLoaders.scala:29)
at com.databricks.backend.daemon.driver.DatabricksILoop$.initializeSharedDriverContext(DatabricksILoop.scala:347)
at com.databricks.backend.daemon.driver.DatabricksILoop$.getOrCreateSharedDriverContext(DatabricksILoop.scala:277)
at com.databricks.backend.daemon.driver.DriverCorral.driverContext(DriverCorral.scala:229)
at com.databricks.backend.daemon.driver.DriverCorral.<init>(DriverCorral.scala:102)
at com.databricks.backend.daemon.driver.DriverDaemon.<init>(DriverDaemon.scala:50)
at com.databricks.backend.daemon.driver.DriverDaemon$.create(DriverDaemon.scala:287)
at com.databricks.backend.daemon.driver.DriverDaemon$.wrappedMain(DriverDaemon.scala:362)
at com.databricks.DatabricksMain.$anonfun$main$1(DatabricksMain.scala:117)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.DatabricksMain.$anonfun$withStartupProfilingData$1(DatabricksMain.scala:425)
at com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:395)
at com.databricks.logging.UsageLogging.executeThunkAndCaptureResultTags$1(UsageLogging.scala:484)
at com.databricks.logging.UsageLogging.$anonfun$recordOperationWithResultTags$4(UsageLogging.scala:504)
at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:266)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:261)
at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:258)
at com.databricks.DatabricksMain.withAttributionContext(DatabricksMain.scala:85)
at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:305)
at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:297)
at com.databricks.DatabricksMain.withAttributionTags(DatabricksMain.scala:85)
at com.databricks.logging.UsageLogging.recordOperationWithResultTags(UsageLogging.scala:479)
at com.databricks.logging.UsageLogging.recordOperationWithResultTags$(UsageLogging.scala:404)
at com.databricks.DatabricksMain.recordOperationWithResultTags(DatabricksMain.scala:85)
at com.databricks.logging.UsageLogging.recordOperation(UsageLogging.scala:395)
at com.databricks.logging.UsageLogging.recordOperation$(UsageLogging.scala:367)
at com.databricks.DatabricksMain.recordOperation(DatabricksMain.scala:85)
at com.databricks.DatabricksMain.withStartupProfilingData(DatabricksMain.scala:425)
at com.databricks.DatabricksMain.main(DatabricksMain.scala:116)
at com.databricks.backend.daemon.driver.DriverDaemon.main(DriverDaemon.scala)
Caused by: java.lang.NullPointerException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.databricks.dbutils_v1.DBUtilsHolder$$anon$1.invoke(DBUtilsHolder.scala:17)
at com.sun.proxy.$Proxy35.fs(Unknown Source)
at my.custom.listener.LibraryListener.<init>(LibraryListener.scala:19)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.util.Utils$.$anonfun$loadExtensions$1(Utils.scala:3077)
at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at scala.collection.TraversableLike.flatMap(TraversableLike.scala:245)
at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:108)
at org.apache.spark.util.Utils$.loadExtensions(Utils.scala:3066)
at org.apache.spark.SparkContext.$anonfun$setupAndStartListenerBus$1(SparkContext.scala:2810)
at org.apache.spark.SparkContext.$anonfun$setupAndStartListenerBus$1$adapted(SparkContext.scala:2809)
at scala.Option.foreach(Option.scala:407)
at org.apache.spark.SparkContext.setupAndStartListenerBus(SparkContext.scala:2809)
... 33 more
build.gradle
plugins {
id 'scala'
id 'java-library'
}
repositories {
mavenCentral()
}
dependencies {
// Use Scala 2.13 in our library project
implementation 'org.scala-lang:scala-library:2.12.15'
// Crucial Implementation
// https://mvnrepository.com/artifact/com.databricks/dbutils-api
implementation group: 'com.databricks', name: 'dbutils-api_2.12', version: '0.0.5'
implementation group: 'org.slf4j', name: 'slf4j-api', version: '1.7.32'
implementation group: 'org.apache.spark', name: 'spark-core_2.12', version: '3.0.0'
implementation group: 'org.apache.spark', name: 'spark-sql_2.12', version: '3.0.0'
implementation 'com.google.guava:guava:30.1.1-jre'
testImplementation 'junit:junit:4.13.2'
testImplementation 'org.scalatest:scalatest_2.12:3.2.9'
testImplementation 'org.scalatestplus:junit-4-13_2.12:3.2.2.0'
testImplementation group: 'org.slf4j', name: 'slf4j-simple', version: '1.7.32'
testRuntimeOnly 'org.scala-lang.modules:scala-xml_2.12:1.2.0'
api 'org.apache.commons:commons-math3:3.6.1'
}
Thank you for any insights!

Difficulties in using a Gcloud Composer DAG to run a Spark job

I'm playing around with Gcloud Composer, trying to create a DAG that creates a DataProc cluster, runs a simple Spark job, then tears down the cluster. I am trying to run the Spark PI example job.
I understand that when calling DataProcSparkOperator I can choose only to define either the main_jar or the main_class property. When I define main_class, the job fails with the error:
java.lang.ClassNotFoundException: org.apache.spark.examples.SparkPi
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:239)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:851)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
When I choose to define the main_jar property, the job fails with the error:
Error: No main class set in JAR; please specify one with --class
Run with --help for usage help or --verbose for debug output
I'm at a bit of a loss as to how to resolve this, as I am kinda new to both Spark and DataProc.
My DAG:
import datetime as dt

from airflow import DAG, models
from airflow.contrib.operators import dataproc_operator as dpo
from airflow.utils import trigger_rule

MAIN_JAR = 'file:///usr/lib/spark/examples/jars/spark-examples.jar'
MAIN_CLASS = 'org.apache.spark.examples.SparkPi'
CLUSTER_NAME = 'quickspark-cluster-{{ ds_nodash }}'

yesterday = dt.datetime.combine(
    dt.datetime.today() - dt.timedelta(1),
    dt.datetime.min.time())

default_dag_args = {
    'start_date': yesterday,
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': dt.timedelta(seconds=30),
    'project_id': models.Variable.get('gcp_project')
}

with DAG('dataproc_spark_submit', schedule_interval='0 17 * * *',
         default_args=default_dag_args) as dag:

    create_dataproc_cluster = dpo.DataprocClusterCreateOperator(
        project_id=default_dag_args['project_id'],
        task_id='create_dataproc_cluster',
        cluster_name=CLUSTER_NAME,
        num_workers=2,
        zone=models.Variable.get('gce_zone')
    )

    run_spark_job = dpo.DataProcSparkOperator(
        task_id='run_spark_job',
        # main_jar=MAIN_JAR,
        main_class=MAIN_CLASS,
        cluster_name=CLUSTER_NAME
    )

    delete_dataproc_cluster = dpo.DataprocClusterDeleteOperator(
        project_id=default_dag_args['project_id'],
        task_id='delete_dataproc_cluster',
        cluster_name=CLUSTER_NAME,
        trigger_rule=trigger_rule.TriggerRule.ALL_DONE
    )

    create_dataproc_cluster >> run_spark_job >> delete_dataproc_cluster
I compared it with a successful job submitted via the CLI and saw that, even when the class populated the Main class or jar field, the path to the jar was still specified under Jar files (see the example submission below).
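A roughly equivalent CLI submission would be something like this (a sketch; the cluster name is illustrative):

gcloud dataproc jobs submit spark \
    --cluster=quickspark-cluster-20180101 \
    --class=org.apache.spark.examples.SparkPi \
    --jars=file:///usr/lib/spark/examples/jars/spark-examples.jar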
Checking the operator, I noticed there is also a dataproc_spark_jars parameter, which is not mutually exclusive with main_class:
run_spark_job = dpo.DataProcSparkOperator(
    task_id='run_spark_job',
    dataproc_spark_jars=[MAIN_JAR],
    main_class=MAIN_CLASS,
    cluster_name=CLUSTER_NAME
)
Adding it did the trick.

DSE spark-submit failing with SHUTDOWN_HOOK_PRIORITY, I do not have hadoop2 in cp

I am trying to run the following Java driver program in my local Mac environment, and I'm pretty sure I do not have hadoop2 on my classpath, so I'm not sure why it still fails with the SHUTDOWN_HOOK_PRIORITY error. Any insight would be a great help. I can run a pyspark job with no exception.
I am running DSE 4.8.4 locally, and the following is the invocation:
$SPARKBINFOLDER/dse spark-submit --master local[2] --class com.sample.driver.SampleLoader SampleLoader.jar $#
The following is the code snippet I am using:
import java.io.Serializable;
import java.net.URL;
import java.net.URLClassLoader;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class SampleLoader implements Serializable {

    private transient SparkConf sconf;

    private SampleLoader(SparkConf sconf) {
        this.sconf = sconf;
    }

    private void run() {
        // Print the classpath seen by the system classloader (output shown below)
        ClassLoader cl = ClassLoader.getSystemClassLoader();
        URL[] urls = ((URLClassLoader) cl).getURLs();
        for (URL url : urls) {
            System.out.println(url.getFile());
        }

        // Fails here with the SHUTDOWN_HOOK_PRIORITY error
        JavaSparkContext jsc = new JavaSparkContext(sconf);
        runSparkJob(jsc);
        jsc.stop();
    }

    private void runSparkJob(JavaSparkContext jsc) {
    }
}
The following is the classloader classpath, which I printed just before the failing line of code (JavaSparkContext jsc = new JavaSparkContext(sconf);):
########Printing the Classloader class path ........ /Users/xxxxxx/cassandra/dse484/resources/spark/conf/ /Users/xxxxxx/cassandra/dse484/lib/dse-core-4.8.4.jar /Users/xxxxxx/cassandra/dse484/lib/dse-hadoop-4.8.4.jar /Users/xxxxxx/cassandra/dse484/lib/dse-hive-4.8.4.jar /Users/xxxxxx/cassandra/dse484/lib/dse-search-4.8.4.jar /Users/xxxxxx/cassandra/dse484/lib/dse-spark-4.8.4.jar /Users/xxxxxx/cassandra/dse484/lib/dse-sqoop-4.8.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/conf/ /Users/xxxxxx/cassandra/dse484/resources/spark/lib/JavaEWAH-0.3.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/RoaringBitmap-0.4.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/ST4-4.0.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/activation-1.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/akka-actor_2.10-2.3.4-spark.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/akka-remote_2.10-2.3.4-spark.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/akka-slf4j_2.10-2.3.4-spark.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/ant-1.9.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/ant-launcher-1.9.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/antlr-2.7.7.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/antlr-runtime-3.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/arpack_combined_all-0.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/asm-3.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/asm-4.0.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/asm-commons-3.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/asm-tree-3.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/avro-1.7.7.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/avro-ipc-1.7.7.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/avro-mapred-1.7.7-hadoop1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/avro-mapred-1.7.7.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/bonecp-0.8.0.RELEASE.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/breeze-macros_2.10-0.11.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/breeze_2.10-0.11.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/chill-java-0.5.0.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/chill_2.10-0.5.0.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-cli-1.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-codec-1.9.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-collections-3.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-compress-1.4.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-httpclient-3.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-io-2.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-lang-2.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-lang3-3.3.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-logging-1.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-math3-3.4.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/commons-net-2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/compress-lzf-1.0.3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/config-1.2.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/core-1.1.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/guava-16.0.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hamcrest-core-1.3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/httpclient-4.4.1.jar 
/Users/xxxxxx/cassandra/dse484/resources/spark/lib/httpcore-4.4.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/ivy-2.4.0.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jackson-annotations-2.3.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jackson-core-2.3.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jackson-core-asl-1.9.13.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jackson-databind-2.3.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jackson-mapper-asl-1.9.13.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jackson-module-scala_2.10-2.3.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jansi-1.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/javax.servlet-3.0.0.v201112011016.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/javolution-5.5.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jaxb-api-2.2.7.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jaxb-core-2.2.7.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jaxb-impl-2.2.7.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jdo-api-3.0.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jersey-core-1.9.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jersey-server-1.9.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jets3t-0.7.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jetty-all-7.6.0.v20120127.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jetty-continuation-8.1.14.v20131031.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jetty-http-8.1.14.v20131031.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jetty-io-8.1.14.v20131031.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jetty-security-8.1.14.v20131031.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jetty-server-8.1.14.v20131031.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jetty-servlet-8.1.14.v20131031.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jetty-util-6.1.26.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jetty-util-8.1.14.v20131031.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jline-0.9.94.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jline-2.10.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/joda-convert-1.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/joda-time-2.3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jodd-core-3.6.3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jpam-1.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/json-20090211.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/json4s-ast_2.10-3.2.10.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/json4s-core_2.10-3.2.10.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/json4s-jackson_2.10-3.2.10.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jsr166e-1.1.0.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jsr305-2.0.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jta-1.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/jtransforms-2.4.0.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/junit-4.12.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/kryo-2.21.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/libfb303-0.9.3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/libthrift-0.9.3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/lz4-1.2.0.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/mail-1.4.1.jar 
/Users/xxxxxx/cassandra/dse484/resources/spark/lib/mesos-0.21.1-shaded-protobuf.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/metrics-core-3.1.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/metrics-graphite-3.1.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/metrics-json-3.1.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/metrics-jvm-3.1.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/minlog-1.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/objenesis-1.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/opencsv-2.3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/oro-2.0.8.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/paranamer-2.6.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/parquet-column-1.6.0rc3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/parquet-common-1.6.0rc3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/parquet-encoding-1.6.0rc3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/parquet-format-2.2.0-rc1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/parquet-generator-1.6.0rc3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/parquet-hadoop-1.6.0rc3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/parquet-hadoop-bundle-1.3.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/parquet-jackson-1.6.0rc3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/pmml-agent-1.1.15.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/pmml-model-1.1.15.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/pmml-schema-1.1.15.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/protobuf-java-2.5.0-spark.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/py4j-0.8.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/pyrolite-4.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/quasiquotes_2.10-2.0.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/reflectasm-1.07-shaded.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/scala-compiler-2.10.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/scala-library-2.10.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/scala-reflect-2.10.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/scalap-2.10.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/servlet-api-2.5.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/slf4j-api-1.7.12.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/snappy-0.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/snappy-java-1.0.5.3.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-bagel_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-cassandra-connector-java_2.10-1.4.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-cassandra-connector_2.10-1.4.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-catalyst_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-core_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-graphx_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-hive_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-launcher_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-mllib_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-network-common_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-network-shuffle_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-repl_2.10-1.4.2.2.jar 
/Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-sql_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-streaming_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-unsafe_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spire-macros_2.10-0.7.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spire_2.10-0.7.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/stax-api-1.0.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/stream-2.7.0.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/stringtemplate-3.2.1.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/tachyon-0.6.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/tachyon-client-0.6.4.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/uncommons-maths-1.2.2a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/unused-1.0.0.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/velocity-1.7.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/xz-1.0.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/datanucleus-api-jdo-3.2.6.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/datanucleus-core-3.2.10.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/datanucleus-rdbms-3.2.9.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-0.13-metastore-cassandra-connector-0.2.11.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-0.13.1-cassandra-connector-0.2.11.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-ant-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-beeline-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-cli-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-common-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-exec-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-hwi-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-jdbc-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-metastore-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-serde-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-service-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-shims-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-shims-0.20-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-shims-0.20S-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-shims-0.23-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-shims-common-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/hive-shims-common-secure-0.13.1a.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/spark-hive-thriftserver_2.10-1.4.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/HdrHistogram-1.2.1.1.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/antlr-2.7.7.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/antlr-3.2.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/antlr-runtime-3.2.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/aopalliance-1.0.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-asn1-api-1.0.0-M24.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-asn1-ber-1.0.0-M24.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-i18n-1.0.0-M24.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-ldap-client-api-1.0.0-M24.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-ldap-codec-core-1.0.0-M24.jar 
/Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-ldap-codec-standalone-1.0.0-M24.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-ldap-extras-codec-1.0.0-M24.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-ldap-extras-codec-api-1.0.0-M24.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-ldap-model-1.0.0-M24.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-ldap-net-mina-1.0.0-M24.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/api-util-1.0.0-M24.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/asm-5.0.3.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-beanutils-1.7.0.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-beanutils-core-1.8.0.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-codec-1.9.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-collections-3.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-compiler-2.6.1.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-configuration-1.6.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-digester-1.8.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-io-2.4.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-lang-2.6.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-logging-1.1.1.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/commons-pool-1.6.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/guava-16.0.1.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/guice-3.0.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/guice-multibindings-3.0.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/jackson-annotations-2.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/jackson-core-2.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/jackson-databind-2.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/janino-2.6.1.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/java-uuid-generator-3.1.3.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/javassist-3.18.2-GA.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/javax.inject-1.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/jbcrypt-0.4d.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/jcl-over-slf4j-1.7.10.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/jline-1.0.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/journalio-1.4.2.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/jsr305-2.0.1.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/kmip-1.7.1e.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/log4j-1.2.13.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/mina-core-2.0.7.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/reflections-0.9.10.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/slf4j-api-1.7.10.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/stringtemplate-3.2.jar /Users/xxxxxx/cassandra/dse484/resources/dse/lib/validation-api-1.1.0.Final.jar /Users/xxxxxx/cassandra/dse484/resources/dse/conf/ /Users/xxxxxx/cassandra/dse484/resources/hadoop/ /Users/xxxxxx/cassandra/dse484/resources/hadoop/conf/ /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/airline-0.6.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/ant-1.6.5.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/automaton-1.11-8.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-beanutils-1.7.0.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-beanutils-core-1.8.0.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-cli-1.2.jar 
/Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-codec-1.4.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-collections-3.2.2.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-configuration-1.6.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-digester-1.8.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-el-1.0.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-httpclient-3.0.1.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-lang-2.4.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-logging-1.1.1.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-math-2.1.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/commons-net-1.4.1.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/core-3.1.1.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/elephant-bird-hadoop-compat-4.3.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/ftplet-api-1.0.0.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/ftpserver-core-1.0.0.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/ftpserver-deprecated-1.0.0-M2.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/hadoop-core-1.0.4.18.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/hadoop-examples-1.0.4.18.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/hadoop-fairscheduler-1.0.4.18.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/hadoop-streaming-1.0.4.18.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/hadoop-test-1.0.4.18.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/hadoop-tools-1.0.4.18.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/hsqldb-1.8.0.10.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/jackson-core-asl-1.8.8.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/jackson-mapper-asl-1.8.8.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/javax.inject-1.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/jets3t-0.7.1.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/jetty-6.1.26.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/jetty-util-6.1.26.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/jsp-2.1-6.1.14.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/jsp-api-2.1-6.1.14.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/kfs-0.3.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/mina-core-2.0.0-M5.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/netty-3.9.8.Final.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/oro-2.0.8.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/servlet-api-2.5-20081211.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/servlet-api-2.5-6.1.14.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/snappy-java-1.0.5.3.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/lib/xmlenc-0.52.jar /Users/xxxxxx/cassandra/dse484/resources/driver/lib/cassandra-driver-core-2.1.7.1.jar /Users/xxxxxx/cassandra/dse484/resources/driver/lib/cassandra-driver-dse-2.1.7.1.jar /Users/xxxxxx/cassandra/dse484/resources/driver/lib/metrics-core-3.0.2.jar /Users/xxxxxx/cassandra/dse484/resources/driver/lib/slf4j-api-1.7.5.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/antlr-runtime-3.5.2.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/conf/ /Users/xxxxxx/cassandra/dse484/resources/cassandra/tools/lib/stress.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/ST4-4.0.8.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/antlr-3.5.2.jar 
/Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/cassandra-all-2.1.12.1046.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/cassandra-clientutil-2.1.12.1046.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/cassandra-thrift-2.1.12.1046.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/commons-cli-1.1.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/commons-codec-1.9.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/commons-lang-2.6.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/commons-lang3-3.1.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/commons-logging-1.2.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/commons-math3-3.2.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/compress-lzf-0.8.4.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/concurrentlinkedhashmap-lru-1.3.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/disruptor-3.0.1.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/elephant-bird-hadoop-compat-4.3.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/fastutil-6.5.7.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/guava-16.0.1.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/high-scale-lib-1.0.6.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/httpclient-4.4.1.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/httpcore-4.4.1.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/jackson-core-asl-1.9.2.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/jackson-mapper-asl-1.9.2.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/jamm-0.3.0.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/jbcrypt-0.4d.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/jna-4.0.0.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/joda-time-1.6.2.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/json-simple-1.1.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/libthrift-0.9.3.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/logback-classic-1.1.2.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/logback-core-1.1.2.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/lz4-1.2.0.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/metrics-core-2.2.0.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/netty-all-4.0.33.dse.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/reporter-config-2.1.0.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/slf4j-api-1.7.12.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/snakeyaml-1.12.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/snappy-java-1.0.5.3.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/stream-2.5.2.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/super-csv-2.1.0.jar /Users/xxxxxx/cassandra/dse484/resources/cassandra/lib/thrift-server-0.3.7.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/datanucleus-api-jdo-3.2.6.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/datanucleus-core-3.2.10.jar /Users/xxxxxx/cassandra/dse484/resources/spark/lib/datanucleus-rdbms-3.2.9.jar /Users/xxxxxx/cassandra/dse484/resources/hadoop/conf/
The following is the exception:
Exception in thread "main" java.lang.ExceptionInInitializerError
at org.apache.spark.util.Utils$.createTempDir(Utils.scala:225)
at org.apache.spark.util.Utils$$anonfun$getOrCreateLocalRootDirsImpl$2.apply(Utils.scala:653)
at (JavaSparkContext.scala:61)
at com.walmart.gis.spark.uber.ExtractCatalogItems.run(ExtractCatalogItems.java:60)
at com.walmart.gis.spark.uber.ExtractCatalogItems.main(ExtractCatalogItems.java:285)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
Caused by: java.lang.NoSuchFieldException: SHUTDOWN_HOOK_PRIORITY
at java.lang.Class.getField(Class.java:1584)
at org.apache.spark.util.SparkShutdownHookManager.install(ShutdownHookManager.scala:222)
at org.apache.spark.util.ShutdownHookManager$.shutdownHooks$lzycompute(ShutdownHookManager.scala:50)
at org.apache.spark.util.ShutdownHookManager$.shutdownHooks(ShutdownHookManager.scala:48)
at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:191)
at org.apache.spark.util.ShutdownHookManager$.<init>(ShutdownHookManager.scala:58)
at org.apache.spark.util.ShutdownHookManager$.<clinit>(ShutdownHookManager.scala)
... 32 more
I was able to work around this issue by setting the scope to provided on the spark dependency in the pom file.
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>${spark.version}</version>
    <!-- To fix the SHUTDOWN_HOOK_PRIORITY error, add this line -->
    <scope>provided</scope>
</dependency>
This forced the application to use the Spark library included with DSE rather than the one packaged in my JAR file.

Error in simple spark application

I'm running a simple Spark application which does 'word to vector' (Word2Vec). Here is my code (this is from the Spark website):
import org.apache.spark._
import org.apache.spark.rdd._
import org.apache.spark.SparkContext._
import org.apache.spark.mllib.feature.{Word2Vec, Word2VecModel}

object SimpleApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Word2Vector")
    val sc = new SparkContext(conf)
    val input = sc.textFile("text8").map(line => line.split(" ").toSeq)
    val word2vec = new Word2Vec()
    val model = word2vec.fit(input)
    val synonyms = model.findSynonyms("china", 40)
    for ((synonym, cosineSimilarity) <- synonyms) {
      println(s"$synonym $cosineSimilarity")
    }
    // Save and load model
    model.save(sc, "myModelPath")
  }
}
When running it, it gives me the following error message:
Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://GXYDEVVM:8020/user/hadoop/YOUR_SPARK_HOME/README.md
at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:285)
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1781)
at org.apache.spark.rdd.RDD.count(RDD.scala:1099)
at org.apache.spark.api.java.JavaRDDLike$class.count(JavaRDDLike.scala:442)
at org.apache.spark.api.java.AbstractJavaRDDLike.count(JavaRDDLike.scala:47)
at SimpleApp.main(SimpleApp.java:13)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:665)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:170)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
What is the problem? Where is this address /user/hadoop/YOUR_SPARK_HOME/README.md coming from?
This is probably related to your default Spark configuration.
Take a look (or use grep) in the conf directory of your Spark home directory. You should find a spark-env.sh file, which could contain a reference to the strange file.
In fact, Spark is trying to load a file from HDFS (kind of a standard if you run Spark on a cluster: your input/output should be reachable by the master and the worker slaves). If you use Spark locally, you have to configure the Spark context using the setMaster method. Here is my version:
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.feature.{Word2Vec, Word2VecModel}

object SparkDemo {

  def log[A](key: String)(job: => A) = {
    val start = System.currentTimeMillis
    val output = job
    println("===> %s in %s seconds"
      .format(key, (System.currentTimeMillis - start) / 1000.0))
    output
  }

  def main(args: Array[String]): Unit = {
    val modelName = "w2vModel"

    val sc = new SparkContext(
      new SparkConf()
        .setAppName("SparkDemo")
        .set("spark.executor.memory", "4G")
        .set("spark.driver.maxResultSize", "16G")
        .setMaster("spark://192.168.1.53:7077") // ip of the spark master.
        // .setMaster("local[2]") // does not work... workers lose contact with the master after 120s
    )

    // take a look into the target folder if you are unsure how the jar is named
    // one-liner to compile / run: sbt package && sbt run
    sc.addJar("./target/scala-2.10/sparkling_2.10-0.1.jar")

    val input = sc.textFile("./text8").map(line => line.split(" ").toSeq)

    val word2vec = new Word2Vec()

    val model = log("compute model") { word2vec.fit(input) }
    log("save model") { model.save(sc, modelName) }

    val synonyms = model.findSynonyms("china", 40)
    for ((synonym, cosineSimilarity) <- synonyms) {
      println(s"$synonym $cosineSimilarity")
    }

    val model2 = log("reload model") { Word2VecModel.load(sc, modelName) }
  }
}

Error when running job that queries against Cassandra via Spark SQL through Spark Jobserver

So I'm trying to run a job that simply runs a query against Cassandra using spark-sql; the job is submitted fine and it starts fine. This code works when it is not being run through spark-jobserver (when simply using spark-submit). Could someone tell me what is wrong with my job code or configuration files that is causing the error below?
{
  "status": "ERROR",
  "ERROR": {
    "errorClass": "java.util.concurrent.ExecutionException",
    "cause": "Failed to open native connection to Cassandra at {127.0.1.1}:9042",
    "stack": [
      "com.datastax.spark.connector.cql.CassandraConnector$.com$datastax$spark$connector$cql$CassandraConnector$$createSession(CassandraConnector.scala:155)",
      "com.datastax.spark.connector.cql.CassandraConnector$$anonfun$2.apply(CassandraConnector.scala:141)",
      "com.datastax.spark.connector.cql.CassandraConnector$$anonfun$2.apply(CassandraConnector.scala:141)",
      "com.datastax.spark.connector.cql.RefCountedCache.createNewValueAndKeys(RefCountedCache.scala:31)",
      "com.datastax.spark.connector.cql.RefCountedCache.acquire(RefCountedCache.scala:56)",
      "com.datastax.spark.connector.cql.CassandraConnector.openSession(CassandraConnector.scala:73)",
      "com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:101)",
      "com.datastax.spark.connector.cql.CassandraConnector.withClusterDo(CassandraConnector.scala:112)",
      "com.datastax.spark.connector.cql.Schema$.fromCassandra(Schema.scala:243)",
      "org.apache.spark.sql.cassandra.CassandraCatalog$$anon$1.load(CassandraCatalog.scala:22)",
      "org.apache.spark.sql.cassandra.CassandraCatalog$$anon$1.load(CassandraCatalog.scala:19)",
      "com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)",
      "com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)",
      "com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)",
      "com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2257)",
      "com.google.common.cache.LocalCache.get(LocalCache.java:4000)",
      "com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004)",
      "com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)",
      "org.apache.spark.sql.cassandra.CassandraCatalog.lookupRelation(CassandraCatalog.scala:28)",
      "org.apache.spark.sql.cassandra.CassandraSQLContext$$anon$2.org$apache$spark$sql$catalyst$analysis$OverrideCatalog$$super$lookupRelation(CassandraSQLContext.scala:218)",
      "org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:161)",
      "org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:161)",
      "scala.Option.getOrElse(Option.scala:120)",
      "org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(Catalog.scala:161)",
      "org.apache.spark.sql.cassandra.CassandraSQLContext$$anon$2.lookupRelation(CassandraSQLContext.scala:218)",
      "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:174)",
      "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:186)",
      "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:181)",
      "org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:188)",
      "org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:188)",
      "org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:51)",
      "org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:187)",
      "org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:208)",
      "scala.collection.Iterator$$anon$11.next(Iterator.scala:328)",
      "scala.collection.Iterator$class.foreach(Iterator.scala:727)",
      "scala.collection.AbstractIterator.foreach(Iterator.scala:1157)",
      "scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)",
      "scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)",
      "scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)",
      "scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)",
      "scala.collection.AbstractIterator.to(Iterator.scala:1157)",
      "scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)",
      "scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)",
      "scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)",
      "scala.collection.AbstractIterator.toArray(Iterator.scala:1157)",
      "org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:238)",
      "org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:193)",
      "org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:178)",
      "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:181)",
      "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:171)",
      "org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)",
      "org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)",
      "scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)",
      "scala.collection.immutable.List.foldLeft(List.scala:84)",
      "org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)",
      "org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)",
      "scala.collection.immutable.List.foreach(List.scala:318)",
      "org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)",
      "org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:1082)",
      "org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:1082)",
      "org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:1080)",
      "org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)",
      "org.apache.spark.sql.cassandra.CassandraSQLContext.cassandraSql(CassandraSQLContext.scala:211)",
      "org.apache.spark.sql.cassandra.CassandraSQLContext.sql(CassandraSQLContext.scala:214)",
      "CassSparkTest$.runJob(CassSparkTest.scala:23)",
      "CassSparkTest$.runJob(CassSparkTest.scala:9)",
      "spark.jobserver.JobManagerActor$$anonfun$spark$jobserver$JobManagerActor$$getJobFuture$4.apply(JobManagerActor.scala:235)",
      "scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)",
      "scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)",
      "java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)",
      "java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)",
      "java.lang.Thread.run(Thread.java:745)"
    ],
    "causingClass": "java.io.IOException",
    "message": "java.io.IOException: Failed to open native connection to Cassandra at {127.0.1.1}:9042"
  }
}
Here is the job I am running:
import org.apache.spark.{SparkContext, SparkConf}
import com.datastax.spark.connector._
import org.apache.spark.sql.cassandra.CassandraSQLContext
import org.apache.spark.sql._
import spark.jobserver._
import com.typesafe.config.Config
import com.typesafe.config.ConfigFactory

object CassSparkTest extends SparkJob {

  def main(args: Array[String]) {
    val sc = new SparkContext("spark://192.168.10.11:7077", "test")
    val config = ConfigFactory.parseString("")
    val results = runJob(sc, config)
    println("Results:" + results)
  }

  override def validate(sc: SparkContext, config: Config): SparkJobValidation = {
    SparkJobValid
  }

  override def runJob(sc: SparkContext, config: Config): Any = {
    val sqlC = new CassandraSQLContext(sc)
    val df = sqlC.sql(config.getString("input.sql"))
    df.collect()
  }
}
and here is my configuration file for spark-jobserver
# Template for a Spark Job Server configuration file
# When deployed these settings are loaded when job server starts
#
# Spark Cluster / Job Server configuration
spark {
  # spark.master will be passed to each job's JobContext
  master = "spark://192.168.10.11:7077"
  # master = "mesos://vm28-hulk-pub:5050"
  # master = "yarn-client"

  # Default # of CPUs for jobs to use for Spark standalone cluster
  job-number-cpus = 1

  jobserver {
    port = 2020
    jar-store-rootdir = /tmp/jobserver/jars
    jobdao = spark.jobserver.io.JobFileDAO
    filedao {
      rootdir = /tmp/spark-job-server/filedao/data
    }
  }

  # predefined Spark contexts
  # contexts {
  #   my-low-latency-context {
  #     num-cpu-cores = 1       # Number of cores to allocate. Required.
  #     memory-per-node = 512m  # Executor memory per node, -Xmx style eg 512m, 1G, etc.
  #   }
  #   # define additional contexts here
  # }

  # universal context configuration. These settings can be overridden, see README.md
  context-settings {
    num-cpu-cores = 1       # Number of cores to allocate. Required.
    memory-per-node = 512m  # Executor memory per node, -Xmx style eg 512m, #1G, etc.

    # in case spark distribution should be accessed from HDFS (as opposed to being installed on every mesos slave)
    # spark.executor.uri = "hdfs://namenode:8020/apps/spark/spark.tgz"

    spark-cassandra-connection-host="127.0.0.1"

    # uris of jars to be loaded into the classpath for this context. Uris is a string list, or a string separated by commas ','
    # dependent-jar-uris = ["file:///some/path/present/in/each/mesos/slave/somepackage.jar"]
    dependent-jar-uris = ["file:///home/vagrant/lib/spark-cassandra-connector-assembly-1.3.0-M2-SNAPSHOT.jar"]

    # If you wish to pass any settings directly to the sparkConf as-is, add them here in passthrough,
    # such as hadoop connection settings that don't use the "spark." prefix
    passthrough {
      #es.nodes = "192.1.1.1"
    }
  }

  # This needs to match SPARK_HOME for cluster SparkContexts to be created successfully
  # home = "/home/spark/spark"
}

# Note that you can use this file to define settings not only for job server,
# but for your Spark jobs as well. Spark job configuration merges with this configuration file as defaults.
@vicg, first you need spark.cassandra.connection.host -- periods, not dashes. Also note in the error how the IP is "127.0.1.1", not the one in the config. You can also pass the IP when you create a context, like:
curl -X POST 'localhost:8090/contexts/my-context?spark.cassandra.connection.host=127.0.0.1'
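For reference, the context-settings block from the config above with the corrected key would look roughly like this (same values as before, only the connection-host key renamed; whether any further settings are needed is not covered here):

context-settings {
  num-cpu-cores = 1
  memory-per-node = 512m
  # periods, not dashes:
  spark.cassandra.connection.host = "127.0.0.1"
  dependent-jar-uris = ["file:///home/vagrant/lib/spark-cassandra-connector-assembly-1.3.0-M2-SNAPSHOT.jar"]
}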
If the above doesn't work, try the following PR:
https://github.com/spark-jobserver/spark-jobserver/pull/164
