Zeppelin Pyspark on HDP 2.3 giving error - apache-spark

I am trying to configure Zeppelin to work with HDP 2.3 (Spark 1.3). I have successfully installed Zeppelin via Ambari and the Zeppelin service is running.
But when I try to run any %pyspark command I get the error below.
I read a few blogs, and it seems there may be an issue with a jar compiled on Java 6 versus Java 7 being shared between Python and Spark.
Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 7, sandbox.hortonworks.com): org.apache.spark.SparkException:
Error from python worker:
/usr/bin/python: No module named pyspark
PYTHONPATH was:
/opt/incubator-zeppelin/interpreter/spark/zeppelin-spark-0.6.0-incubating-SNAPSHOT.jar
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:163)
at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:86)
at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:62)
at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:105)
at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1204)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1193)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1192)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
(<class 'py4j.protocol.Py4JJavaError'>, Py4JJavaError(u'An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob.\n', JavaObject id=o68), <traceback object at 0x2618bd8>)
Took 0 seconds

Can you check whether your zeppelin-env.sh has the line below?
export PYTHONPATH=${SPARK_HOME}/python
If it is missing, it can be added via Ambari under Zeppelin > Configs > Advanced zeppelin-env > zeppelin-env template.
That said, if you installed Zeppelin using the latest version of the Ambari Zeppelin service, it should have done this for you:
https://github.com/hortonworks-gallery/ambari-zeppelin-service/blob/master/configuration/zeppelin-env.xml#L63
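For reference, here is a minimal zeppelin-env.sh sketch. The paths assume a standard HDP layout (Spark under /usr/hdp/current/spark-client, matching the working output shown further below); adjust SPARK_HOME and the py4j zip version to your install.
# Minimal sketch for zeppelin-env.sh (or the zeppelin-env template in Ambari).
# Paths are assumptions based on a default HDP 2.3 layout.
export SPARK_HOME=/usr/hdp/current/spark-client
export PYTHONPATH=${SPARK_HOME}/python:${SPARK_HOME}/python/lib/pyspark.zip:${SPARK_HOME}/python/lib/py4j-0.8.2.1-src.zip
After changing it, restart the Zeppelin service so the Spark interpreter picks up the new environment.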

I just set up a fresh HDP 2.3 install (2.3.0.0-2557) on CentOS 6.5 using Ambari 2.1 and installed Zeppelin via the Ambari Zeppelin service (with default configs). PySpark works fine for me.
Based on your error, it sounds like PYTHONPATH is not being set to the correct value:
PYTHONPATH was:
/opt/incubator-zeppelin/interpreter/spark/zeppelin-spark-0.6.0-incubating-SNAPSHOT.jar
In Zeppelin, can you run the lines below in a cell and provide the output?
System.getenv().get("MASTER")
System.getenv().get("SPARK_YARN_JAR")
System.getenv().get("HADOOP_CONF_DIR")
System.getenv().get("JAVA_HOME")
System.getenv().get("SPARK_HOME")
System.getenv().get("PYSPARK_PYTHON")
System.getenv().get("PYTHONPATH")
System.getenv().get("ZEPPELIN_JAVA_OPTS")
Here is the output on my setup:
res41: String = yarn-client
res42: String = hdfs:///apps/zeppelin/zeppelin-spark-0.6.0-SNAPSHOT.jar
res43: String = /etc/hadoop/conf
res44: String = /usr/java/default
res45: String = /usr/hdp/current/spark-client/
res46: String = null
res47: String = /usr/hdp/current/spark-client//python:/usr/hdp/current/spark-client//python/lib/pyspark.zip:/usr/hdp/current/spark-client//python/lib/py4j-0.8.2.1-src.zip
res48: String = -Dhdp.version=2.3.0.0-2557 -Dspark.executor.memory=512m -Dspark.yarn.queue=default

Related

Cannot run livy in sparkR mode on zeppelin 0.8

We've installed the Zeppelin service that ships with HDP 3.1.5 and installed a suitable R version (from the RHEL 7 repos) that is compatible with that Zeppelin version (0.8).
Running Spark code with the spark interpreter, especially sparkR, works well, but when we switch the interpreter to livy we always get an error with the message "Fail to start interpreter".
In the YARN UI the application is created and running, but the Spark UI's stderr shows:
22/07/28 10:21:17 WARN SparkRInterpreter$: Fail to init Spark RBackend, using different method signature
java.lang.ClassCastException: scala.Tuple2 cannot be cast to java.lang.Integer
at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:101)
at org.apache.livy.repl.SparkRInterpreter$$anon$1.run(SparkRInterpreter.scala:88)
22/07/28 10:21:17 WARN Session: Fail to start interpreter sparkr
java.io.IOException: Cannot run program "R": error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
at org.apache.livy.repl.SparkRInterpreter$.apply(SparkRInterpreter.scala:143)
at org.apache.livy.repl.Session.liftedTree1$1(Session.scala:107)
at org.apache.livy.repl.Session.interpreter(Session.scala:98)
at org.apache.livy.repl.Session$$anonfun$execute$1.apply$mcV$sp(Session.scala:168)
at org.apache.livy.repl.Session$$anonfun$execute$1.apply(Session.scala:163)
at org.apache.livy.repl.Session$$anonfun$execute$1.apply(Session.scala:163)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error=2, No such file or directory
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
at java.lang.ProcessImpl.start(ProcessImpl.java:134)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
... 11 more
Is there any step/configuration that we skipped?
-Thanks-

Apache Beam Issue with Spark Runner while using Kafka IO

I am trying to test KafkaIO for Apache Beam code with the Spark runner.
The code works fine with the direct runner.
However, if I add the line below, it throws an error:
options.setRunner(SparkRunner.class);
Error:
ERROR org.apache.spark.executor.Executor: Exception in task 0.0 in stage 2.0 (TID 0)
java.lang.StackOverflowError
at java.base/java.io.ObjectInputStream$BlockDataInputStream.readByte(ObjectInputStream.java:3307)
at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2135)
at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1668)
at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:482)
at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:440)
at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:488)
at jdk.internal.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
Versions that I am trying to use:
<beam.version>2.33.0</beam.version>
<spark.version>3.1.2</spark.version>
<kafka.version>3.0.0</kafka.version>
This issue was resolved by adding the VM argument -Xss2M.
This link helped me solve it:
https://github.com/eclipse-openj9/openj9/issues/10370
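If the pipeline is launched through spark-submit rather than from an IDE, the same stack-size bump could be applied to the driver and executor JVMs via Spark conf. This is only a sketch of that idea; the class and jar names are placeholders, not from the original post.
# Sketch: pass -Xss2m to the driver and executor JVMs when submitting the Beam pipeline.
# com.example.MyBeamPipeline and my-beam-pipeline.jar are hypothetical names.
$SPARK_HOME/bin/spark-submit \
  --conf spark.driver.extraJavaOptions=-Xss2m \
  --conf spark.executor.extraJavaOptions=-Xss2m \
  --class com.example.MyBeamPipeline \
  my-beam-pipeline.jar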

NoSuchMethodError: org.apache.hadoop.conf.Configuration.reloadExistingConfigurations()V

I am using <spark.version>3.0.2</spark.version> with Delta version 0.8.0 in my project,
and running with:
export SPARK_HOME=/pkg/spark-3.0.2-bin-hadoop2.7-hive1.2
$SPARK_HOME/bin/spark-submit \
--master yarn \
--deploy-mode cluster \
--packages io.delta:delta-core_2.12:0.8.0,org.apache.hadoop:hadoop-common:2.9.2,org.apache.hadoop:hadoop-aws:2.9.2,org.apache.hudi:hudi-spark-bundle_2.12:0.6.0
I am getting the error below:
User class threw exception: java.lang.NoSuchMethodError: org.apache.hadoop.conf.Configuration.reloadExistingConfigurations()V
at org.apache.hadoop.fs.s3a.S3AFileSystem.addDeprecatedKeys(S3AFileSystem.java:183)
at org.apache.hadoop.fs.s3a.S3AFileSystem.<clinit>(S3AFileSystem.java:187)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2134)
at org.apache.spark.sql.delta.sources.DeltaDataSource.createRelation(DeltaDataSource.scala:171)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:354)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:297)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:232)
What is wrong here? Any clue how to fix it?
I also tried with
export SPARK_HOME=/pkg/spark-3.0.2-bin-hadoop2.9.1-custom
and got this error:
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: ShuffleMapStage 4 ($anonfun$call$1 at DatabricksLogging.scala:77) has failed the maximum allowable number of times: 4. Most recent failure reason: org.apache.spark.shuffle.FetchFailedException: java.lang.IllegalArgumentException: Unknown message type: 9 at org.apache.spark.network.shuffle.protocol.BlockTransferMessage$Decoder.fromByteBuffer(BlockTransferMessage.java:71) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ... 1 more
at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2059)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2007)
at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1602)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2236)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2188)
at org.apache.spark.sql.delta.DeltaLog$.recordDeltaOperation(DeltaLog.scala:368)
at org.apache.spark.sql.delta.DeltaLog$$anon$3.call(DeltaLog.scala:470)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
Your Spark distribution is compiled against Hadoop 2.7, but you're pulling Hadoop 2.9 jars in at runtime. Remove the Hadoop 2.9 coordinates from the --packages option.
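Concretely, the submit command from the question would become something like the sketch below, keeping Delta and Hudi but dropping the Hadoop 2.9 artifacts. If S3A support is still needed, a hadoop-aws version matching the bundled Hadoop 2.7 line would be the safer choice; that is an assumption, not something stated in the question.
# Sketch of the corrected command: the hadoop-common/hadoop-aws 2.9.2 coordinates
# are removed so the Hadoop 2.7 classes bundled with the Spark build are used.
export SPARK_HOME=/pkg/spark-3.0.2-bin-hadoop2.7-hive1.2
$SPARK_HOME/bin/spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --packages io.delta:delta-core_2.12:0.8.0,org.apache.hudi:hudi-spark-bundle_2.12:0.6.0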

Apache Spark on Databricks Caused by: scala.reflect.internal.MissingRequirementError: object java.lang.Object in compiler mirror not found

Whenever I attempt to execute any command (for example %fs rm -r /mnt/driver-daemon/jars/) on Apache Spark on Databricks Community Edition, I get the following error:
java.lang.Exception: An error occurred while initializing the REPL. Please check whether there are conflicting Scala libraries or JARs attached to the cluster, such as Scala 2.11 libraries attached to Scala 2.10 cluster (or vice-versa).
When I look into the error logs I see the problem is caused by:
Caused by: scala.reflect.internal.MissingRequirementError: object java.lang.Object in compiler mirror not found.
The full error is as follows:
at com.databricks.backend.daemon.driver.DriverILoop.initSpark(DriverILoop.scala:60)
at com.databricks.backend.daemon.driver.DriverILoop.initializeSpark(DriverILoop.scala:185)
at com.databricks.backend.daemon.driver.DriverILoop.createInterpreterForWeb(DriverILoop.scala:165)
at com.databricks.backend.daemon.driver.ScalaDriverLocal.createInterpreter(ScalaDriverLocal.scala:417)
at com.databricks.backend.daemon.driver.ScalaDriverLocal.interp(ScalaDriverLocal.scala:434)
at com.databricks.backend.daemon.driver.ScalaDriverLocal$$anonfun$repl$1.apply$mcV$sp(ScalaDriverLocal.scala:202)
at com.databricks.backend.daemon.driver.ScalaDriverLocal$$anonfun$repl$1.apply(ScalaDriverLocal.scala:202)
at com.databricks.backend.daemon.driver.ScalaDriverLocal$$anonfun$repl$1.apply(ScalaDriverLocal.scala:202)
at com.databricks.backend.daemon.driver.DriverLocal$TrapExitInternal$.trapExit(DriverLocal.scala:714)
at com.databricks.backend.daemon.driver.DriverLocal$TrapExit$.apply(DriverLocal.scala:667)
at com.databricks.backend.daemon.driver.ScalaDriverLocal.repl(ScalaDriverLocal.scala:202)
at com.databricks.backend.daemon.driver.DriverLocal$$anonfun$execute$9.apply(DriverLocal.scala:396)
at com.databricks.backend.daemon.driver.DriverLocal$$anonfun$execute$9.apply(DriverLocal.scala:373)
at com.databricks.logging.UsageLogging$$anonfun$withAttributionContext$1.apply(UsageLogging.scala:238)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at com.databricks.logging.UsageLogging$class.withAttributionContext(UsageLogging.scala:233)
at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:49)
at com.databricks.logging.UsageLogging$class.withAttributionTags(UsageLogging.scala:275)
at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:49)
at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:373)
at com.databricks.backend.daemon.driver.DriverWrapper$$anonfun$tryExecutingCommand$2.apply(DriverWrapper.scala:644)
at com.databricks.backend.daemon.driver.DriverWrapper$$anonfun$tryExecutingCommand$2.apply(DriverWrapper.scala:644)
at scala.util.Try$.apply(Try.scala:192)
at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:639)
at com.databricks.backend.daemon.driver.DriverWrapper.getCommandOutputAndError(DriverWrapper.scala:485)
at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:597)
at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:390)
at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:337)
at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:219)
at java.lang.Thread.run(Thread.java:748)
Caused by: scala.reflect.internal.MissingRequirementError: object java.lang.Object in compiler mirror not found.
at scala.reflect.internal.MissingRequirementError$.signal(MissingRequirementError.scala:17)
at scala.reflect.internal.MissingRequirementError$.notFound(MissingRequirementError.scala:18)
at scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:53)
at scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:45)
at scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:45)
at scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:66)
at scala.reflect.internal.Mirrors$RootsBase.getClassByName(Mirrors.scala:102)
at scala.reflect.internal.Mirrors$RootsBase.getRequiredClass(Mirrors.scala:105)
at scala.reflect.internal.Definitions$DefinitionsClass.ObjectClass$lzycompute(Definitions.scala:257)
at scala.reflect.internal.Definitions$DefinitionsClass.ObjectClass(Definitions.scala:257)
at scala.reflect.internal.Definitions$DefinitionsClass.init(Definitions.scala:1390)
at scala.tools.nsc.Global$Run.<init>(Global.scala:1242)
at scala.tools.nsc.interpreter.IMain.compileSourcesKeepingRun(IMain.scala:439)
at scala.tools.nsc.interpreter.DriverIMain.compileSourcesKeepingRun(DriverIMain.scala:305)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.compileAndSaveRun(IMain.scala:862)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.compile(IMain.scala:820)
at scala.tools.nsc.interpreter.DriverIMain.bind(DriverIMain.scala:84)
at com.databricks.backend.daemon.driver.DriverILoop.bind(DriverILoop.scala:191)
at com.databricks.backend.daemon.driver.DatabricksILoop$class.initSpark(DatabricksILoop.scala:87)
at com.databricks.backend.daemon.driver.DriverILoop.initSpark(DriverILoop.scala:60)
at com.databricks.backend.daemon.driver.DriverILoop.initializeSpark(DriverILoop.scala:185)
at com.databricks.backend.daemon.driver.DriverILoop.createInterpreterForWeb(DriverILoop.scala:165)
at com.databricks.backend.daemon.driver.ScalaDriverLocal.createInterpreter(ScalaDriverLocal.scala:417)
at com.databricks.backend.daemon.driver.ScalaDriverLocal.interp(ScalaDriverLocal.scala:434)
at com.databricks.backend.daemon.driver.ScalaDriverLocal$$anonfun$repl$1.apply$mcV$sp(ScalaDriverLocal.scala:202)
at com.databricks.backend.daemon.driver.ScalaDriverLocal$$anonfun$repl$1.apply(ScalaDriverLocal.scala:202)
at com.databricks.backend.daemon.driver.ScalaDriverLocal$$anonfun$repl$1.apply(ScalaDriverLocal.scala:202)
at com.databricks.backend.daemon.driver.DriverLocal$TrapExitInternal$.trapExit(DriverLocal.scala:714)
at com.databricks.backend.daemon.driver.DriverLocal$TrapExit$.apply(DriverLocal.scala:667)
at com.databricks.backend.daemon.driver.ScalaDriverLocal.repl(ScalaDriverLocal.scala:202)
at com.databricks.backend.daemon.driver.DriverLocal$$anonfun$execute$9.apply(DriverLocal.scala:396)
at com.databricks.backend.daemon.driver.DriverLocal$$anonfun$execute$9.apply(DriverLocal.scala:373)
at com.databricks.logging.UsageLogging$$anonfun$withAttributionContext$1.apply(UsageLogging.scala:238)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at com.databricks.logging.UsageLogging$class.withAttributionContext(UsageLogging.scala:233)
at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:49)
at com.databricks.logging.UsageLogging$class.withAttributionTags(UsageLogging.scala:275)
at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:49)
at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:373)
at com.databricks.backend.daemon.driver.DriverWrapper$$anonfun$tryExecutingCommand$2.apply(DriverWrapper.scala:644)
at com.databricks.backend.daemon.driver.DriverWrapper$$anonfun$tryExecutingCommand$2.apply(DriverWrapper.scala:644)
at scala.util.Try$.apply(Try.scala:192)
at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:639)
at com.databricks.backend.daemon.driver.DriverWrapper.getCommandOutputAndError(DriverWrapper.scala:485)
at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:597)
at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:390)
at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:337)
at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:219)
at java.lang.Thread.run(Thread.java:748)
Any thoughts on how to go about resolving this issue?
Execute java -version in a command prompt / Linux terminal. If you are executing the code in IntelliJ or any other IDE, check the project settings and the SDK being used. If you are submitting it to a Spark master, check the Java version used on the Spark master with the same command.
Java 8 is compatible with Scala 2.11 / 2.12 / 2.13.
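As a quick check, something like the sketch below can be run on whatever machine actually executes the code (SPARK_HOME is only relevant when a local Spark install is in play):
# Show the JVM the shell would use
java -version
echo $JAVA_HOME
# spark-submit --version also prints the Scala version Spark was built with and the JVM it runs on
$SPARK_HOME/bin/spark-submit --version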

DSE 4.8.8 upgrade throws spark cassandra connector config variables error

I have recently upgraded my DSE version from 4.8.4 to 4.8.8 and I am seeing the following unexpected result.
I am using:
spark-cassandra-connector-java_2.10: 1.4.2
spark-streaming_2.10: 1.4.1
spark-core_2.10: 1.4.1
WARN 2016-06-23 21:28:50,132 org.apache.spark.scheduler.TaskSetManager: Lost task 3.0 in stage 3365.0 (TID 6731, 172.31.17.116): java.lang.ExceptionInInitializerError
at com.dynosense.dynospark.SparkApp$2.call(SparkApp.java:107)
at com.dynosense.dynospark.SparkApp$2.call(SparkApp.java:103)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1.apply(JavaPairRDD.scala:1027)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1508)
at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1099)
at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1099)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1792)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1792)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
at org.apache.spark.scheduler.Task.run(Task.scala:70)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.datastax.spark.connector.util.ConfigCheck$ConnectorConfigurationException: Invalid Config Variables
Only known spark.cassandra.* variables are allowed when using the Spark Cassandra Connector.
spark.cassandra.connection.host.autoUpdate is not a valid Spark Cassandra Connector variable.
No likely matches found.
at com.datastax.spark.connector.util.ConfigCheck$.checkConfig(ConfigCheck.scala:46)
at com.datastax.spark.connector.cql.CassandraConnectorConf$.apply(CassandraConnectorConf.scala:171)
at com.datastax.spark.connector.cql.CassandraConnector$.apply(CassandraConnector.scala:192)
at com.datastax.spark.connector.cql.CassandraConnector.apply(CassandraConnector.scala)
at com.dynosense.dynospark.SparkApp.<clinit>(SparkApp.java:62)
... 15 more
