Starting up Spark History Server to write to minIO - apache-spark

I'm trying to get Spark History Server to run on my cluster that is running on Kubernetes, and I'd like the logs to get written to minIO. I'm also using minIO as storage of the input and output of my spark-submit jobs, which is working already.
Currently working spark-submit jobs
My working spark-submit job looks something like the following:
spark-submit \
--conf spark.hadoop.fs.s3a.access.key=XXXX \
--conf spark.hadoop.fs.s3a.secret.key=XXXX \
--conf spark.hadoop.fs.s3a.endpoint=https://someIpv4 \
--conf spark.hadoop.fs.s3a.connection.ssl.enabled=true \
--conf spark.hadoop.fs.s3a.path.style.access=true \
--conf spark.hadoop.fs.default.name="s3a:///" \
--conf spark.driver.extraJavaOptions="-Djavax.net.ssl.trustStore=XXXX -Djavax.net.ssl.trustStorePassword=XXXX" \
--conf spark.executor.extraJavaOptions="-Djavax.net.ssl.trustStore=XXXX -Djavax.net.ssl.trustStorePassword=XXXX" \
...
As you can see, I'm using SSL to connect to minIO and to read/write files.
What I'm trying
I'm trying to spin up the history server with minIO as storage without using SSL.
To start the history server, I run the bundled start-history-server.sh script with a properties file that defines the log storage location: ./start-history-server.sh --properties-file my_conf_file. my_conf_file looks like this:
spark.eventLog.enabled=true
spark.eventLog.dir=s3a://myBucket/spark-events
spark.history.fs.logDirectory=s3a://myBucket/spark-events
spark.hadoop.fs.s3a.access.key=XXXX
spark.hadoop.fs.s3a.secret.key=XXXX
spark.hadoop.fs.s3a.endpoint=http://someIpv4
spark.hadoop.fs.s3a.path.style.access=true
spark.hadoop.fs.s3a.connection.ssl.enabled=false
So you see I'm not adding any SSL parameters. But when I run ./start-history-server.sh --properties-file my_conf_file, I'm getting this error:
INFO AmazonHttpClient: Unable to execute HTTP request: Connection refused (Connection refused)
java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:607)
at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:121)
at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180)
at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:326)
at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:610)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:445)
at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:835)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:384)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1031)
at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:994)
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:297)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:117)
at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:86)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:296)
at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)
What I have tried/found on the internet
This person had a very similar problem to mine, but it seems like they solved it using spark.hadoop.fs.s3a.path.style.access, which I'm already using
I was able to spin up History server using the local filesystem, so that seems to be working correctly
I have seen people, like in this post, using the spark.hadoop.fs.s3a.impl key with org.apache.hadoop.fs.s3a.S3AFileSystem as its value. When I do this, however, it seems like this class doesn't exist within my AWS jars.
I have the following AWS jars at my disposal: aws-java-sdk-1.7.4.jar and hadoop-aws-2.7.3.jar
Since my spark-submit jobs run fine, reading and writing files to minIO, and I'm not supplying the spark.hadoop.fs.s3a.impl parameter in them, I would think that parameter is not needed?
Does anyone have an idea of where I should be looking/what I'm doing wrong?

The problem was that my minIO instance did not accept plain HTTP requests. My already working spark-submit jobs used HTTPS (SSL), so I added the same SSL parameters to $SPARK_DAEMON_JAVA_OPTS, and the history server started working.
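A minimal sketch of what this fix could look like. The truststore path and password are placeholders for the values the working spark-submit job already uses, and SPARK_HOME is assumed to point at the Spark installation. Presumably my_conf_file also needs the https:// endpoint and spark.hadoop.fs.s3a.connection.ssl.enabled=true, since minIO rejects plain HTTP.

```shell
# Placeholder values (assumptions): substitute the truststore path and
# password that the working spark-submit job already uses.
TRUSTSTORE_PATH="${TRUSTSTORE_PATH:-/path/to/truststore.jks}"
TRUSTSTORE_PASSWORD="${TRUSTSTORE_PASSWORD:-changeit}"

# The history server is a daemon process, so its JVM options are read from
# SPARK_DAEMON_JAVA_OPTS rather than from spark.driver.extraJavaOptions.
export SPARK_DAEMON_JAVA_OPTS="-Djavax.net.ssl.trustStore=${TRUSTSTORE_PATH} -Djavax.net.ssl.trustStorePassword=${TRUSTSTORE_PASSWORD}"

echo "SPARK_DAEMON_JAVA_OPTS=${SPARK_DAEMON_JAVA_OPTS}"

# Start the history server only if the script is actually present.
if [ -x "${SPARK_HOME:-}/sbin/start-history-server.sh" ]; then
  "${SPARK_HOME}/sbin/start-history-server.sh" --properties-file my_conf_file
fi
```

Exporting the variable in conf/spark-env.sh instead of the calling shell should have the same effect.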

Related

How to spark-submit to ZooKeeper-managed Mesos cluster (gives java.net.UnknownHostException: zk for mesos://zk:// master URL)?

I'm running Spark 2.0.2 and Mesos 0.28.2.
I'm attempting to submit an application to Spark, using a ZooKeeper-managed Mesos cluster as the master:
$SPARK_HOME/bin/spark-submit --verbose \
--conf spark.mesos.executor.docker.image=$DOCKER_IMAGE \
--conf spark.mesos.executor.home=$SPARK_HOME \
--conf spark.executorEnv.MESOS_NATIVE_JAVA_LIBRARY=/usr/lib/libmesos.so \
--deploy-mode cluster \
--master mesos://zk://<ip 1>:2181,<ip 2>:2181,<ip 3>:2181/mesos \
--class $APP_MAIN_CLASS \
file://$APP_JAR_PATH
(<ip 1>, <ip 2>, and <ip 3> are IPv4 addresses in the 10.0.0.0/8 block)
According to the documentation, I seem to have the right format for the master:
The Master URLs for Mesos are in the form mesos://host:5050 for a single-master Mesos cluster, or mesos://zk://host1:2181,host2:2181,host3:2181/mesos for a multi-master Mesos cluster using ZooKeeper.
However, it appears that Spark is reading the mesos://zk://... string then attempting to connect to zk:
17/04/07 20:10:06 INFO RestSubmissionClient: Submitting a request to launch an application in mesos://zk://<ip 1>:2181,<ip 2>:2181,<ip 3>:2181/mesos.
Exception in thread "main" java.net.UnknownHostException: zk
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at java.net.Socket.connect(Socket.java:538)
at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
at sun.net.www.http.HttpClient.New(HttpClient.java:308)
at sun.net.www.http.HttpClient.New(HttpClient.java:326)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1202)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1138)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1032)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:966)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1316)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1291)
at org.apache.spark.deploy.rest.RestSubmissionClient.org$apache$spark$deploy$rest$RestSubmissionClient$$postJson(RestSubmissionClient.scala:214)
at org.apache.spark.deploy.rest.RestSubmissionClient$$anonfun$createSubmission$3.apply(RestSubmissionClient.scala:89)
at org.apache.spark.deploy.rest.RestSubmissionClient$$anonfun$createSubmission$3.apply(RestSubmissionClient.scala:85)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
at org.apache.spark.deploy.rest.RestSubmissionClient.createSubmission(RestSubmissionClient.scala:85)
at org.apache.spark.deploy.rest.RestSubmissionClient$.run(RestSubmissionClient.scala:417)
at org.apache.spark.deploy.rest.RestSubmissionClient$.main(RestSubmissionClient.scala:430)
at org.apache.spark.deploy.rest.RestSubmissionClient.main(RestSubmissionClient.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
How do I get Spark to recognize that it should be using the three ZooKeeper nodes rather than trying to connect to a non-existent zk host?
tl;dr It won't work unless you either change --deploy-mode to client or use a master URL with a single Mesos host, e.g. mesos://host:port.
The following line gives the hint where to find the relevant code.
17/04/07 20:10:06 INFO RestSubmissionClient: Submitting a request to launch an application in mesos://zk://<ip 1>:2181,<ip 2>:2181,<ip 3>:2181/mesos.
It looks like the message is only printed for --deploy-mode cluster with Spark Standalone and Apache Mesos. Change it to the default, client, and the deployment path will change and hopefully accept the master URL.
See for yourself the code that's responsible for the cluster deployment: RestSubmissionClient.
Here RestSubmissionClient says:
private val supportedMasterPrefixes = Seq("spark://", "mesos://")
which proves mesos:// URLs are covered, but here you see the following:
private val masters: Array[String] = if (master.startsWith("spark://")) {
  Utils.parseStandaloneMasterUrls(master)
} else {
  Array(master)
}
This value is what gets printed in the INFO message above, which shows that a mesos:// URL is treated as a single Mesos master rather than being split into multiple hosts.
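To illustrate why the host ends up as zk, the sketch below mimics the URL handling with plain shell string operations. It rests on an assumption about RestSubmissionClient: that the submission URL is built by stripping the mesos:// prefix and prepending http://. The IP addresses are made-up placeholders.

```shell
# Placeholder master URL in the multi-master ZooKeeper form (made-up IPs).
master="mesos://zk://10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181/mesos"

# Assumed behaviour: strip the supported prefix and prepend http://.
base_url="http://${master#mesos://}"

# The host is whatever sits between "http://" and the next ":" or "/".
rest="${base_url#http://}"
host="${rest%%[:/]*}"

echo "base url: ${base_url}"
echo "host:     ${host}"
```

Under that assumption the remaining zk:// scheme is read as a host name, which matches the java.net.UnknownHostException: zk in the question.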

Spark Streaming - java.io.IOException: Lease timeout of 0 seconds expired

I have a Spark Streaming application that writes checkpoints to HDFS.
Does anyone know the solution?
Previously we used kinit to specify the principal and keytab. We were advised to pass them via the spark-submit command instead of kinit, but the error still occurs and brings the Spark Streaming application down.
spark-submit --principal sparkuser@HADOOP.ABC.COM \
--keytab /home/sparkuser/keytab/sparkuser.keytab \
--name MyStreamingApp \
--master yarn-cluster \
--conf "spark.driver.extraJavaOptions=-XX:+UseConcMarkSweepGC" \
--conf "spark.eventLog.enabled=true" \
--conf "spark.streaming.backpressure.enabled=true" \
--conf "spark.streaming.stopGracefullyOnShutdown=true" \
--conf "spark.executor.extraJavaOptions=-XX:+UseConcMarkSweepGC" \
--class com.abc.DataProcessor myapp.jar
I see multiple occurrences of the following exception in the logs, and finally a SIGTERM (15) that kills the executor and driver. We are using CDH 5.5.2.
2016-10-02 23:59:50 ERROR SparkListenerBus LiveListenerBus:96 -
Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:148)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:148)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:148)
at org.apache.spark.scheduler.EventLoggingListener.onUnpersistRDD(EventLoggingListener.scala:184)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:50)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:56)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:79)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1135)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Lease timeout of 0 seconds expired.
at org.apache.hadoop.hdfs.DFSOutputStream.abort(DFSOutputStream.java:2370)
at org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:964)
at org.apache.hadoop.hdfs.DFSClient.renewLease(DFSClient.java:932)
at org.apache.hadoop.hdfs.LeaseRenewer.renew(LeaseRenewer.java:423)
at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:448)
at org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71)
at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:304)
at java.lang.Thread.run(Thread.java:745)

getting CSV sink metrics files from spark-submit at run time

With metrics.properties in /conf and the CSV sink enabled (see the configuration below), metrics are collected every time I submit a job with spark-submit and are saved to /tmp/:
# Enable CsvSink for all instances
*.sink.csv.class=org.apache.spark.metrics.sink.CsvSink
# Polling period for CsvSink
*.sink.csv.period=1
*.sink.csv.unit=minutes
# Polling directory for CsvSink
*.sink.csv.directory=/tmp/
# Worker instance overlap polling period
worker.sink.csv.period=1
worker.sink.csv.unit=minutes
Now I want to supply the metrics.properties file at run time (with the same configuration as above), so I passed the following arguments to spark-submit:
$spark_home/bin/spark-submit --files=file:///home/log_properties/metrics.properties --conf spark.metrics.conf=./metrics.properties --class com.myClass job1.jar
I get the following warning even though my metrics.properties file contains no Graphite configuration (I started from metrics.template and enabled only the CSV settings above):
WARN graphite.GraphiteReporter: Unable to report to Graphite
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at java.net.Socket.connect(Socket.java:538)
at java.net.Socket.<init>(Socket.java:434)
at java.net.Socket.<init>(Socket.java:244)
at javax.net.DefaultSocketFactory.createSocket(SocketFactory.java:277)
at com.codahale.metrics.graphite.Graphite.connect(Graphite.java:118)
at com.codahale.metrics.graphite.GraphiteReporter.report(GraphiteReporter.java:167)
at com.codahale.metrics.ScheduledReporter.report(ScheduledReporter.java:162)
at org.apache.spark.metrics.sink.GraphiteSink.report(GraphiteSink.scala:91)
at org.apache.spark.metrics.MetricsSystem$$anonfun$report$1.apply(MetricsSystem.scala:114)
at org.apache.spark.metrics.MetricsSystem$$anonfun$report$1.apply(MetricsSystem.scala:114)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.metrics.MetricsSystem.report(MetricsSystem.scala:114)
at org.apache.spark.SparkContext$$anonfun$stop$3.apply$mcV$sp(SparkContext.scala:1715)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1219)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1714)
at org.apache.spark.SparkContext$$anonfun$3.apply$mcV$sp(SparkContext.scala:596)
at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:267)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:239)
Does it default to reporting to Graphite and ignore my metrics.properties (which enables only the CSV sink)?
Pass the setting as -Dspark.metrics.conf=metrics.properties, not via --conf spark.metrics.conf=./metrics.properties.
That is why your file, although it is shipped, is not used for the metrics configuration; the default metrics.properties is used instead.
Yeah, I realized I had a metrics.properties file locally (in the directory where I run spark-submit), but what I passed, i.e. --files=file:///home/log_properties/metrics.properties in the spark-submit, doesn't... I resolved the issue by updating the local file (removing the Graphite flags), but I am still puzzled about why it should care about the local metrics.properties when I have already passed the metrics.properties that I want to use for my job.
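A small sketch for checking which metrics file is in play. It writes a CSV-only metrics.properties to a scratch directory (a hypothetical location), counts any Graphite entries in it, and prints rather than runs a submit command. Using --driver-java-options to carry the -D setting is an assumption about how the suggestion above would be applied.

```shell
# Hypothetical scratch location for a CSV-only metrics file.
conf_dir="$(mktemp -d)"
metrics_file="${conf_dir}/metrics.properties"

cat > "${metrics_file}" <<'EOF'
*.sink.csv.class=org.apache.spark.metrics.sink.CsvSink
*.sink.csv.period=1
*.sink.csv.unit=minutes
*.sink.csv.directory=/tmp/
EOF

# Count lines that mention Graphite (case-insensitive); 0 means CSV-only.
# grep exits non-zero when nothing matches, hence the "|| true".
graphite_lines="$(grep -ci graphite "${metrics_file}" || true)"
echo "graphite entries: ${graphite_lines}"

# Print, rather than run, the submit command (assumed flag usage).
echo spark-submit --files "${metrics_file}" \
  --driver-java-options "-Dspark.metrics.conf=metrics.properties" \
  --class com.myClass job1.jar
```

Running the same grep against the metrics.properties in the directory where spark-submit is launched would reveal leftover Graphite settings like the ones described above.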

spark-submit classpath issue with --repositories --packages options

I'm running Spark in a standalone cluster where the Spark master, worker and submit each run in their own Docker container.
When I spark-submit my Java app with the --repositories and --packages options, I can see that it successfully downloads the app's required dependencies. However, the stderr logs in the Spark worker's web UI report a java.lang.ClassNotFoundException: kafka.serializer.StringDecoder. This class is available in one of the dependencies downloaded by spark-submit, but it doesn't look like it's available on the worker classpath.
16/02/22 16:17:09 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:58)
at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
Caused by: java.lang.NoClassDefFoundError: kafka/serializer/StringDecoder
at com.my.spark.app.JavaDirectKafkaWordCount.main(JavaDirectKafkaWordCount.java:71)
... 6 more
Caused by: java.lang.ClassNotFoundException: kafka.serializer.StringDecoder
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 7 more
The spark-submit call:
${SPARK_HOME}/bin/spark-submit --deploy-mode cluster \
--master spark://spark-master:7077 \
--repositories https://oss.sonatype.org/content/groups/public/ \
--packages org.apache.spark:spark-streaming-kafka_2.10:1.6.0,org.elasticsearch:elasticsearch-spark_2.10:2.2.0 \
--class com.my.spark.app.JavaDirectKafkaWordCount \
/app/spark-app.jar kafka-server:9092 mytopic
I was working with Spark 2.4.0 when I ran into this problem. I don't have a solution yet, just some observations based on experimentation and reading around for solutions. I am noting them down here in case they help someone in their investigation. I will update this answer if I find more information later.
The --repositories option is required only if some custom repository has to be referenced
By default the maven central repository is used if the --repositories option is not provided
When --packages option is specified, the submit operation tries to look for the packages and their dependencies in the ~/.ivy2/cache, ~/.ivy2/jars, ~/.m2/repository directories.
If they are not found, then they are downloaded from maven central using ivy and stored under the ~/.ivy2 directory.
In my case I had observed that
spark-shell worked perfectly with the --packages option
spark-submit would fail to do the same. It would download the dependencies correctly but fail to pass on the jars to the driver and worker nodes
spark-submit worked with the --packages option if I ran the driver locally using --deploy-mode client instead of cluster.
This would run the driver locally in the command shell where I ran the spark-submit command but the worker would run on the cluster with the appropriate dependency jars
I found the following discussion useful but I still have to nail down this problem.
https://github.com/databricks/spark-redshift/issues/244#issuecomment-347082455
Most people just use an uber jar to avoid running into this problem, and also to avoid conflicting jar versions, where the platform provides a different version of the same dependency jar.
But I don't like that idea beyond a stopgap arrangement and am still looking for a solution.

How to give dependent jars to spark submit in cluster mode

I am running Spark using cluster mode for deployment. Below is the command:
JARS=$JARS_HOME/amqp-client-3.5.3.jar,$JARS_HOME/nscala-time_2.10-2.0.0.jar,\
$JARS_HOME/rabbitmq-0.1.0-RELEASE.jar,\
$JARS_HOME/kafka_2.10-0.8.2.1.jar,$JARS_HOME/kafka-clients-0.8.2.1.jar,\
$JARS_HOME/spark-streaming-kafka_2.10-1.4.1.jar,\
$JARS_HOME/zkclient-0.3.jar,$JARS_HOME/protobuf-java-2.4.0a.jar
dse spark-submit -v --conf "spark.serializer=org.apache.spark.serializer.KryoSerializer" \
--executor-memory 512M \
--total-executor-cores 3 \
--deploy-mode "cluster" \
--master spark://$MASTER:7077 \
--jars=$JARS \
--supervise \
--class "com.testclass" $APP_JAR input.json \
--files "/home/test/input.json"
The above command works fine in client mode, but when I use it in cluster mode I get a class-not-found exception:
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:58)
at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
Caused by: java.lang.NoClassDefFoundError: org/apache/spark/streaming/kafka/KafkaUtils$
In client mode the dependent jars get copied to the /var/lib/spark/work directory, whereas in cluster mode they do not. Please help me get this solved.
EDIT:
I am using NFS and have mounted the same directory on all the Spark nodes under the same name. I still get the error. How is it able to pick up the application jar, which is in the same directory, but not the dependent jars?
In client mode the dependent jars are getting copied to the
/var/lib/spark/work directory whereas in cluster mode it is not.
In cluster mode, the driver program runs in the cluster rather than locally (as it does in client mode), so the dependent jars must be accessible in the cluster; otherwise the driver program and executors will throw a java.lang.NoClassDefFoundError.
Actually, when using spark-submit, the application jar along with any jars included with the --jars option will be automatically transferred to the cluster.
Your extra jars can be added to --jars; they will be copied to the cluster automatically.
Please refer to the "Advanced Dependency Management" section at the link below:
http://spark.apache.org/docs/latest/submitting-applications.html
As the Spark documentation says, either:
Keep all jars and dependencies in the same local path on all nodes in the cluster, or
Keep the jars in a distributed file system that all nodes have access to.
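Since --jars expects a comma-separated list, one way to build it from a directory that every node can see is sketched below. join_jars is a hypothetical helper name, and the directory is whatever JARS_HOME points to in your setup.

```shell
# join_jars DIR: print every *.jar in DIR as one comma-separated line.
# Hypothetical helper, not part of Spark.
join_jars() {
  dir="$1"
  list=""
  for jar in "$dir"/*.jar; do
    # Skip the unexpanded pattern when the directory holds no jars.
    [ -e "$jar" ] || continue
    list="${list:+${list},}${jar}"
  done
  printf '%s\n' "$list"
}

# Example: JARS_HOME is assumed to be a path mounted on all nodes.
JARS="$(join_jars "${JARS_HOME:-/opt/app/jars}")"
echo "--jars=${JARS}"
```

The resulting value can then be passed as --jars="$JARS", placed before the application jar on the spark-submit command line.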