Datastax Spark: Job failed on Zeppelin - apache-spark
I have set up Datastax Enterprice in three nodes in the local network.
Two nodes are debian servers and i used apt package manager for installation. The last node is iMac and i used the .dmg package for installation.
Node #1:
OS: Debian GNU/Linux 8.10 (jessie)
Local IP: 172.16.21.18
Datastax Enterprice: 5.1.7
Node #2:
OS: Ubuntu 16.04.3 LTS
Local IP: 172.16.21.25
Datastax Enterprice: 5.1.7
Node #1:
OS: macOS 10.13.2
Local IP: 192.168.1.108
Datastax Enterprice: 5.1.7
Nodes are up and running in analytics and search mode: ($ dse cassandra -k -s)
Now, I'm trying to connect on Spark Cluster using Apache Zeppelin 0.7.3. Apache Zeppelin is installed and configured in Node #1.
I followed these instructions for configuration. Below you can see some basic changes in config files:
zeppelin-0.7.3-bin-all/conf/zeppelin-env.sh
[..]
export MASTER=spark://172.16.21.18:7077 # Spark master url. eg. spark://master_addr:7077. Leave empty if you want to use local mode.
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export DSE_HOME=/usr
[..]
zeppelin-0.7.3-bin-all/bin/interpreter.sh
[..]
# set spark related env variables
if [[ "${INTERPRETER_ID}" == "spark" ]]; then
if [[ -n "${SPARK_HOME}" ]]; then
export SPARK_SUBMIT="${DSE_HOME}/bin/dse spark-submit"
[..]
Zeppelin Spark Intepreter:
Zeppelin CQL intepreter works perfect with Apache Cassandra but then i'm trying to use Spark Intepreter to execute some queries i'm getting this error:
%spark
val results = spark.sql("SELECT * from keyspace.table")
java.lang.NullPointerException
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:38)
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:33)
[..]
complete zeppelin log file:
INFO [2018-02-21 04:25:36,185] ({Thread-0} RemoteInterpreterServer.java[run]:97) - Starting remote interpreter server on port 52127
INFO [2018-02-21 04:25:36,562] ({pool-1-thread-3} RemoteInterpreterServer.java[createInterpreter]:198) - Instantiate interpreter org.apache.zeppelin.spark.SparkInterpreter
INFO [2018-02-21 04:25:36,589] ({pool-1-thread-3} RemoteInterpreterServer.java[createInterpreter]:198) - Instantiate interpreter org.apache.zeppelin.spark.SparkSqlInterpreter
INFO [2018-02-21 04:25:36,601] ({pool-1-thread-3} RemoteInterpreterServer.java[createInterpreter]:198) - Instantiate interpreter org.apache.zeppelin.spark.DepInterpreter
INFO [2018-02-21 04:25:36,619] ({pool-1-thread-3} RemoteInterpreterServer.java[createInterpreter]:198) - Instantiate interpreter org.apache.zeppelin.spark.PySparkInterpreter
INFO [2018-02-21 04:25:36,622] ({pool-1-thread-3} RemoteInterpreterServer.java[createInterpreter]:198) - Instantiate interpreter org.apache.zeppelin.spark.SparkRInterpreter
INFO [2018-02-21 04:25:36,683] ({pool-2-thread-2} SchedulerFactory.java[jobStarted]:131) - Job remoteInterpretJob_1519205136682 started by scheduler org.apache.zeppelin.spark.SparkInterpreter269729544
INFO [2018-02-21 04:25:40,733] ({pool-2-thread-2} SparkInterpreter.java[createSparkSession]:318) - ------ Create new SparkContext spark://172.16.21.18:7077 -------
WARN [2018-02-21 04:25:40,740] ({pool-2-thread-2} SparkInterpreter.java[setupConfForSparkR]:577) - sparkr.zip is not found, sparkr may not work.
INFO [2018-02-21 04:25:40,786] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Running Spark version 2.1.0
WARN [2018-02-21 04:25:41,760] ({pool-2-thread-2} NativeCodeLoader.java[<clinit>]:62) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
WARN [2018-02-21 04:25:41,958] ({pool-2-thread-2} Logging.scala[logWarning]:66) -
SPARK_CLASSPATH was detected (set to ':/home/cassandra/zeppelin-0.7.3-bin-all/interpreter/spark/dep/*:/home/cassandra/zeppelin-0.7.3-bin-all/interpreter/spark/*:/home/cassandra/zeppelin-0.7.3-bin-all/lib/interpreter/*:').
This is deprecated in Spark 1.0+.
Please instead use:
- ./spark-submit with --driver-class-path to augment the driver classpath
- spark.executor.extraClassPath to augment the executor classpath
WARN [2018-02-21 04:25:41,959] ({pool-2-thread-2} Logging.scala[logWarning]:66) - Setting 'spark.executor.extraClassPath' to ':/home/cassandra/zeppelin-0.7.3-bin-all/interpreter/spark/dep/*:/home/cassandra/zeppelin-0.7.3-bin-all/interpreter/spark/*:/home/cassandra/zeppelin-0.7.3-bin-all/lib/interpreter/*:' as a work-around.
WARN [2018-02-21 04:25:41,960] ({pool-2-thread-2} Logging.scala[logWarning]:66) - Setting 'spark.driver.extraClassPath' to ':/home/cassandra/zeppelin-0.7.3-bin-all/interpreter/spark/dep/*:/home/cassandra/zeppelin-0.7.3-bin-all/interpreter/spark/*:/home/cassandra/zeppelin-0.7.3-bin-all/lib/interpreter/*:' as a work-around.
WARN [2018-02-21 04:25:41,986] ({pool-2-thread-2} Logging.scala[logWarning]:66) - Your hostname, XPLAIN005 resolves to a loopback address: 127.0.1.1; using 172.16.21.18 instead (on interface eth0)
WARN [2018-02-21 04:25:41,987] ({pool-2-thread-2} Logging.scala[logWarning]:66) - Set SPARK_LOCAL_IP if you need to bind to another address
INFO [2018-02-21 04:25:42,017] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Changing view acls to: cassandra
INFO [2018-02-21 04:25:42,017] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Changing modify acls to: cassandra
INFO [2018-02-21 04:25:42,018] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Changing view acls groups to:
INFO [2018-02-21 04:25:42,019] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Changing modify acls groups to:
INFO [2018-02-21 04:25:42,019] ({pool-2-thread-2} Logging.scala[logInfo]:54) - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(cassandra); groups with view permissions: Set(); users with modify permissions: Set(cassandra); groups with modify permissions: Set()
INFO [2018-02-21 04:25:42,417] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Successfully started service 'sparkDriver' on port 51240.
INFO [2018-02-21 04:25:42,445] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Registering MapOutputTracker
INFO [2018-02-21 04:25:42,476] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Registering BlockManagerMaster
INFO [2018-02-21 04:25:42,481] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
INFO [2018-02-21 04:25:42,482] ({pool-2-thread-2} Logging.scala[logInfo]:54) - BlockManagerMasterEndpoint up
INFO [2018-02-21 04:25:42,507] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Created local directory at /tmp/blockmgr-797ea400-69f1-4228-a6da-fe424edce8d4
INFO [2018-02-21 04:25:42,524] ({pool-2-thread-2} Logging.scala[logInfo]:54) - MemoryStore started with capacity 408.9 MB
INFO [2018-02-21 04:25:42,591] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Registering OutputCommitCoordinator
INFO [2018-02-21 04:25:42,700] ({pool-2-thread-2} Log.java[initialized]:186) - Logging initialized #6930ms
INFO [2018-02-21 04:25:42,864] ({pool-2-thread-2} Server.java[doStart]:327) - jetty-9.2.z-SNAPSHOT
INFO [2018-02-21 04:25:42,902] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#2cbd702d{/jobs,null,AVAILABLE}
INFO [2018-02-21 04:25:42,903] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#240b993c{/jobs/json,null,AVAILABLE}
INFO [2018-02-21 04:25:42,903] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#5b7d8292{/jobs/job,null,AVAILABLE}
INFO [2018-02-21 04:25:42,908] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#4c2353ff{/jobs/job/json,null,AVAILABLE}
INFO [2018-02-21 04:25:42,909] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#bd87e4e{/stages,null,AVAILABLE}
INFO [2018-02-21 04:25:42,910] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#73e2d470{/stages/json,null,AVAILABLE}
INFO [2018-02-21 04:25:42,917] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#44bca18c{/stages/stage,null,AVAILABLE}
INFO [2018-02-21 04:25:42,918] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#1256be4f{/stages/stage/json,null,AVAILABLE}
INFO [2018-02-21 04:25:42,919] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#5a349845{/stages/pool,null,AVAILABLE}
INFO [2018-02-21 04:25:42,919] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#3f108627{/stages/pool/json,null,AVAILABLE}
INFO [2018-02-21 04:25:42,926] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#1e01f088{/storage,null,AVAILABLE}
INFO [2018-02-21 04:25:42,927] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#390281c1{/storage/json,null,AVAILABLE}
INFO [2018-02-21 04:25:42,927] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#470ac014{/storage/rdd,null,AVAILABLE}
INFO [2018-02-21 04:25:42,927] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#7c90476c{/storage/rdd/json,null,AVAILABLE}
INFO [2018-02-21 04:25:42,928] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#6d847dc6{/environment,null,AVAILABLE}
INFO [2018-02-21 04:25:42,936] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#40a5e53e{/environment/json,null,AVAILABLE}
INFO [2018-02-21 04:25:42,937] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#513e975e{/executors,null,AVAILABLE}
INFO [2018-02-21 04:25:42,937] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#2f6b1132{/executors/json,null,AVAILABLE}
INFO [2018-02-21 04:25:42,938] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#61cf2354{/executors/threadDump,null,AVAILABLE}
INFO [2018-02-21 04:25:42,939] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#eacb646{/executors/threadDump/json,null,AVAILABLE}
INFO [2018-02-21 04:25:42,951] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#2b8d44aa{/static,null,AVAILABLE}
INFO [2018-02-21 04:25:42,953] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#5c982268{/,null,AVAILABLE}
INFO [2018-02-21 04:25:42,954] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#44556f2c{/api,null,AVAILABLE}
INFO [2018-02-21 04:25:42,955] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#2fa0ef66{/jobs/job/kill,null,AVAILABLE}
INFO [2018-02-21 04:25:42,955] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#6e49562c{/stages/stage/kill,null,AVAILABLE}
INFO [2018-02-21 04:25:42,970] ({pool-2-thread-2} AbstractConnector.java[doStart]:266) - Started ServerConnector#53405611{HTTP/1.1}{0.0.0.0:4040}
INFO [2018-02-21 04:25:42,971] ({pool-2-thread-2} Server.java[doStart]:379) - Started #7201ms
INFO [2018-02-21 04:25:42,971] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Successfully started service 'SparkUI' on port 4040.
INFO [2018-02-21 04:25:42,974] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Bound SparkUI to 0.0.0.0, and started at http://172.16.21.18:4040
INFO [2018-02-21 04:25:43,214] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Added file file:/home/cassandra/zeppelin-0.7.3-bin-all/interpreter/spark/pyspark/pyspark.zip at spark://172.16.21.18:51240/files/pyspark.zip with timestamp 1519205143214
INFO [2018-02-21 04:25:43,217] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Copying /home/cassandra/zeppelin-0.7.3-bin-all/interpreter/spark/pyspark/pyspark.zip to /tmp/spark-2e9292e3-8c4d-445a-92f0-7d54188818db/userFiles-4e8301a5-91bc-4753-8436-6cced0bdc5c5/pyspark.zip
INFO [2018-02-21 04:25:43,226] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Added file file:/home/cassandra/zeppelin-0.7.3-bin-all/interpreter/spark/pyspark/py4j-0.10.4-src.zip at spark://172.16.21.18:51240/files/py4j-0.10.4-src.zip with timestamp 1519205143226
INFO [2018-02-21 04:25:43,227] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Copying /home/cassandra/zeppelin-0.7.3-bin-all/interpreter/spark/pyspark/py4j-0.10.4-src.zip to /tmp/spark-2e9292e3-8c4d-445a-92f0-7d54188818db/userFiles-4e8301a5-91bc-4753-8436-6cced0bdc5c5/py4j-0.10.4-src.zip
INFO [2018-02-21 04:25:43,279] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Created default pool default, schedulingMode: FIFO, minShare: 0, weight: 1
INFO [2018-02-21 04:25:43,325] ({appclient-register-master-threadpool-0} Logging.scala[logInfo]:54) - Connecting to master spark://172.16.21.18:7077...
INFO [2018-02-21 04:25:43,391] ({netty-rpc-connection-0} TransportClientFactory.java[createClient]:250) - Successfully created connection to /172.16.21.18:7077 after 33 ms (0 ms spent in bootstraps)
INFO [2018-02-21 04:26:03,326] ({appclient-register-master-threadpool-0} Logging.scala[logInfo]:54) - Connecting to master spark://172.16.21.18:7077...
INFO [2018-02-21 04:26:23,326] ({appclient-register-master-threadpool-0} Logging.scala[logInfo]:54) - Connecting to master spark://172.16.21.18:7077...
ERROR [2018-02-21 04:26:43,328] ({appclient-registration-retry-thread} Logging.scala[logError]:70) - Application has been killed. Reason: All masters are unresponsive! Giving up.
WARN [2018-02-21 04:26:43,328] ({pool-2-thread-2} Logging.scala[logWarning]:66) - Application ID is not initialized yet.
INFO [2018-02-21 04:26:43,336] ({stop-spark-context} AbstractConnector.java[doStop]:306) - Stopped ServerConnector#53405611{HTTP/1.1}{0.0.0.0:4040}
INFO [2018-02-21 04:26:43,339] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 40068.
INFO [2018-02-21 04:26:43,498] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Server created on 172.16.21.18:40068
INFO [2018-02-21 04:26:43,499] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#6e49562c{/stages/stage/kill,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,500] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#2fa0ef66{/jobs/job/kill,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,501] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#44556f2c{/api,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,501] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#5c982268{/,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,505] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#2b8d44aa{/static,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,506] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#eacb646{/executors/threadDump/json,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,507] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#61cf2354{/executors/threadDump,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,508] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
INFO [2018-02-21 04:26:43,508] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#2f6b1132{/executors/json,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,509] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#513e975e{/executors,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,510] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#40a5e53e{/environment/json,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,511] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#6d847dc6{/environment,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,511] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#7c90476c{/storage/rdd/json,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,512] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#470ac014{/storage/rdd,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,513] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#390281c1{/storage/json,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,513] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#1e01f088{/storage,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,513] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#3f108627{/stages/pool/json,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,514] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Registering BlockManager BlockManagerId(driver, 172.16.21.18, 40068, None)
INFO [2018-02-21 04:26:43,514] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#5a349845{/stages/pool,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,515] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#1256be4f{/stages/stage/json,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,515] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#44bca18c{/stages/stage,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,516] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#73e2d470{/stages/json,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,516] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#bd87e4e{/stages,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,517] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#4c2353ff{/jobs/job/json,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,517] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#5b7d8292{/jobs/job,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,518] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#240b993c{/jobs/json,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,518] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#2cbd702d{/jobs,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,521] ({dispatcher-event-loop-0} Logging.scala[logInfo]:54) - Registering block manager 172.16.21.18:40068 with 408.9 MB RAM, BlockManagerId(driver, 172.16.21.18, 40068, None)
INFO [2018-02-21 04:26:43,522] ({stop-spark-context} Logging.scala[logInfo]:54) - Stopped Spark web UI at http://172.16.21.18:4040
INFO [2018-02-21 04:26:43,526] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Registered BlockManager BlockManagerId(driver, 172.16.21.18, 40068, None)
INFO [2018-02-21 04:26:43,527] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Initialized BlockManager: BlockManagerId(driver, 172.16.21.18, 40068, None)
INFO [2018-02-21 04:26:43,530] ({stop-spark-context} Logging.scala[logInfo]:54) - Shutting down all executors
INFO [2018-02-21 04:26:43,546] ({dispatcher-event-loop-1} Logging.scala[logInfo]:54) - Asking each executor to shut down
WARN [2018-02-21 04:26:43,561] ({dispatcher-event-loop-0} Logging.scala[logWarning]:66) - Drop UnregisterApplication(null) because has not yet connected to master
INFO [2018-02-21 04:26:43,583] ({dispatcher-event-loop-2} Logging.scala[logInfo]:54) - MapOutputTrackerMasterEndpoint stopped!
INFO [2018-02-21 04:26:43,596] ({stop-spark-context} Logging.scala[logInfo]:54) - MemoryStore cleared
INFO [2018-02-21 04:26:43,597] ({stop-spark-context} Logging.scala[logInfo]:54) - BlockManager stopped
INFO [2018-02-21 04:26:43,605] ({stop-spark-context} Logging.scala[logInfo]:54) - BlockManagerMaster stopped
INFO [2018-02-21 04:26:43,608] ({dispatcher-event-loop-1} Logging.scala[logInfo]:54) - OutputCommitCoordinator stopped!
ERROR [2018-02-21 04:26:43,748] ({pool-2-thread-2} Logging.scala[logError]:91) - Error initializing SparkContext.
java.lang.IllegalArgumentException: requirement failed: Can only call getServletHandlers on a running MetricsSystem
at scala.Predef$.require(Predef.scala:224)
at org.apache.spark.metrics.MetricsSystem.getServletHandlers(MetricsSystem.scala:91)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:524)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:38)
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:33)
at org.apache.zeppelin.spark.SparkInterpreter.createSparkSession(SparkInterpreter.java:378)
at org.apache.zeppelin.spark.SparkInterpreter.getSparkSession(SparkInterpreter.java:233)
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:841)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:491)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
INFO [2018-02-21 04:26:43,751] ({pool-2-thread-2} Logging.scala[logInfo]:54) - SparkContext already stopped.
ERROR [2018-02-21 04:26:43,751] ({pool-2-thread-2} Utils.java[invokeMethod]:40) -
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:38)
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:33)
at org.apache.zeppelin.spark.SparkInterpreter.createSparkSession(SparkInterpreter.java:378)
at org.apache.zeppelin.spark.SparkInterpreter.getSparkSession(SparkInterpreter.java:233)
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:841)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:491)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: requirement failed: Can only call getServletHandlers on a running MetricsSystem
at scala.Predef$.require(Predef.scala:224)
at org.apache.spark.metrics.MetricsSystem.getServletHandlers(MetricsSystem.scala:91)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:524)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
... 20 more
INFO [2018-02-21 04:26:43,752] ({stop-spark-context} Logging.scala[logInfo]:54) - Successfully stopped SparkContext
INFO [2018-02-21 04:26:43,752] ({pool-2-thread-2} SparkInterpreter.java[createSparkSession]:379) - Created Spark session
ERROR [2018-02-21 04:26:43,753] ({pool-2-thread-2} Job.java[run]:181) - Job failed
java.lang.NullPointerException
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:38)
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:33)
at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext_2(SparkInterpreter.java:398)
at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:387)
at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:146)
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:843)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:491)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
INFO [2018-02-21 04:26:43,759] ({pool-2-thread-2} SchedulerFactory.java[jobFinished]:137) - Job remoteInterpretJob_1519205136682 finished by scheduler org.apache.zeppelin.spark.SparkInterpreter269729544
What do you think?
UPDATE:
All nodes upgraded to Datastax Enterprice 5.1.7
With DSE 5.1 any reference to the Spark Master should look like this example:
export MASTER=dse://1.20.300.10
ERROR [2018-02-21 04:26:43,328] ({appclient-registration-retry-thread} Logging.scala[logError]:70) - Application has been killed. Reason: All masters are unresponsive! Giving up.
WARN [2018-02-21 04:26:43,328] ({pool-2-thread-2} Logging.scala[logWarning]:66) - Application ID is not initialized yet.
It seems the app is killed. Could you check the logs in spark master ?
Related
Getting NoClassDefFoundError using Spark with spark-cassandra-connector 3.1.0
I've been trying to submit a spark application but get the following exception: WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/Users/alisaberi/Desktop/test-great-expectations/spark-3.2.0-bin-hadoop3.2/jars/spark-unsafe_2.12-3.2.0.jar) to constructor java.nio.DirectByteBuffer(long,int) WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations WARNING: All illegal access operations will be denied in a future release 21/11/13 13:17:42 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2021-11-13T13:17:46+0330 - INFO - Great Expectations logging enabled at 20 level by JupyterUX module. Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 21/11/13 13:17:47 INFO SparkContext: Running Spark version 3.2.0 21/11/13 13:17:47 INFO ResourceUtils: ============================================================== 21/11/13 13:17:47 INFO ResourceUtils: No custom resources configured for spark.driver. 21/11/13 13:17:47 INFO ResourceUtils: ============================================================== 21/11/13 13:17:47 INFO SparkContext: Submitted application: examstat 21/11/13 13:17:47 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0) 21/11/13 13:17:47 INFO ResourceProfile: Limiting resource is cpu 21/11/13 13:17:47 INFO ResourceProfileManager: Added ResourceProfile id: 0 21/11/13 13:17:47 INFO SecurityManager: Changing view acls to: alisaberi 21/11/13 13:17:47 INFO SecurityManager: Changing modify acls to: alisaberi 21/11/13 13:17:47 INFO SecurityManager: Changing view acls groups to: 21/11/13 13:17:47 INFO SecurityManager: Changing modify acls groups to: 21/11/13 13:17:47 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(alisaberi); groups with view permissions: Set(); users with modify permissions: Set(alisaberi); groups with modify permissions: Set() 21/11/13 13:17:47 INFO Utils: Successfully started service 'sparkDriver' on port 62135. 21/11/13 13:17:47 INFO SparkEnv: Registering MapOutputTracker 21/11/13 13:17:47 INFO SparkEnv: Registering BlockManagerMaster 21/11/13 13:17:47 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information 21/11/13 13:17:47 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up 21/11/13 13:17:47 INFO SparkEnv: Registering BlockManagerMasterHeartbeat 21/11/13 13:17:47 INFO DiskBlockManager: Created local directory at /private/var/folders/4q/qc3xhr1x6qx5jr9604nl91w40000gn/T/blockmgr-e6d2444c-2aa6-4690-ac82-7a4ab1d86b6b 21/11/13 13:17:47 INFO MemoryStore: MemoryStore started with capacity 434.4 MiB 21/11/13 13:17:47 INFO SparkEnv: Registering OutputCommitCoordinator 21/11/13 13:17:47 INFO Utils: Successfully started service 'SparkUI' on port 4040. 21/11/13 13:17:47 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.1.3:4040 21/11/13 13:17:47 INFO SparkContext: Added JAR file:///Users/alisaberi/Desktop/test-great-expectations/spark-cassandra-connector-assembly_2.12-3.1.0.jar at spark://192.168.1.3:62135/jars/spark-cassandra-connector-assembly_2.12-3.1.0.jar with timestamp 1636796867038 21/11/13 13:17:47 INFO Executor: Starting executor ID driver on host 192.168.1.3 21/11/13 13:17:47 INFO Executor: Fetching spark://192.168.1.3:62135/jars/spark-cassandra-connector-assembly_2.12-3.1.0.jar with timestamp 1636796867038 21/11/13 13:17:47 INFO TransportClientFactory: Successfully created connection to /192.168.1.3:62135 after 42 ms (0 ms spent in bootstraps) 21/11/13 13:17:47 INFO Utils: Fetching spark://192.168.1.3:62135/jars/spark-cassandra-connector-assembly_2.12-3.1.0.jar to /private/var/folders/4q/qc3xhr1x6qx5jr9604nl91w40000gn/T/spark-3961cb18-dacf-4940-a5ff-36d1bbc2c3bb/userFiles-89f4f184-ba26-4a28-b83f-52cec85d7563/fetchFileTemp11862606911562884947.tmp 21/11/13 13:17:48 INFO Executor: Adding file:/private/var/folders/4q/qc3xhr1x6qx5jr9604nl91w40000gn/T/spark-3961cb18-dacf-4940-a5ff-36d1bbc2c3bb/userFiles-89f4f184-ba26-4a28-b83f-52cec85d7563/spark-cassandra-connector-assembly_2.12-3.1.0.jar to class loader 21/11/13 13:17:48 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 62138. 21/11/13 13:17:48 INFO NettyBlockTransferService: Server created on 192.168.1.3:62138 21/11/13 13:17:48 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy 21/11/13 13:17:48 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.1.3, 62138, None) 21/11/13 13:17:48 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.1.3:62138 with 434.4 MiB RAM, BlockManagerId(driver, 192.168.1.3, 62138, None) 21/11/13 13:17:48 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.1.3, 62138, None) 21/11/13 13:17:48 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.1.3, 62138, None) 21/11/13 13:17:48 WARN SparkSession: Cannot use com.datastax.spark.connector.CassandraSparkExtensions to configure session extensions. java.lang.NoClassDefFoundError: com/datastax/spark/connector/util/Logging at java.base/java.lang.ClassLoader.defineClass1(Native Method) at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1016) at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:151) at java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:825) at java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:723) at java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:646) at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:604) at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:168) at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:576) at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522) at java.base/java.lang.Class.forName0(Native Method) at java.base/java.lang.Class.forName(Class.java:468) at org.apache.spark.util.Utils$.classForName(Utils.scala:216) at org.apache.spark.sql.SparkSession$.$anonfun$applyExtensions$1(SparkSession.scala:1194) at org.apache.spark.sql.SparkSession$.$anonfun$applyExtensions$1$adapted(SparkSession.scala:1192) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$applyExtensions(SparkSession.scala:1192) at org.apache.spark.sql.SparkSession.<init>(SparkSession.scala:104) at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:64) at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:500) at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:481) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at py4j.Gateway.invoke(Gateway.java:238) at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80) at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69) at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182) at py4j.ClientServerConnection.run(ClientServerConnection.java:106) at java.base/java.lang.Thread.run(Thread.java:832) Caused by: java.lang.ClassNotFoundException: com.datastax.spark.connector.util.Logging at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:606) at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:168) at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522) ... 33 more 21/11/13 13:17:48 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir. 21/11/13 13:17:48 INFO SharedState: Warehouse path is 'file:/Users/alisaberi/Desktop/test-great-expectations/spark-warehouse'. /Users/alisaberi/Desktop/test-great-expectations/spark-3.2.0-bin-hadoop3.2/python/lib/pyspark.zip/pyspark/sql/context.py:77: FutureWarning: Deprecated in 3.0.0. Use SparkSession.builder.getOrCreate() instead. Traceback (most recent call last): File "/Users/alisaberi/Desktop/test-great-expectations/test.py", line 33, in <module> sqlContext.read\ File "/Users/alisaberi/Desktop/test-great-expectations/spark-3.2.0-bin-hadoop3.2/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 164, in load File "/Users/alisaberi/Desktop/test-great-expectations/spark-3.2.0-bin-hadoop3.2/python/lib/py4j-0.10.9.2-src.zip/py4j/java_gateway.py", line 1309, in __call__ File "/Users/alisaberi/Desktop/test-great-expectations/spark-3.2.0-bin-hadoop3.2/python/lib/pyspark.zip/pyspark/sql/utils.py", line 111, in deco File "/Users/alisaberi/Desktop/test-great-expectations/spark-3.2.0-bin-hadoop3.2/python/lib/py4j-0.10.9.2-src.zip/py4j/protocol.py", line 326, in get_return_value py4j.protocol.Py4JJavaError: An error occurred while calling o56.load. : java.lang.NoClassDefFoundError: com/datastax/spark/connector/util/Logging at java.base/java.lang.ClassLoader.defineClass1(Native Method) at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1016) at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:151) at java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:825) at java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:723) at java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:646) at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:604) at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:168) at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522) at org.apache.spark.sql.cassandra.DefaultSource.getTable(DefaultSource.scala:55) at org.apache.spark.sql.cassandra.DefaultSource.inferSchema(DefaultSource.scala:72) at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Utils$.getTableFromProvider(DataSourceV2Utils.scala:81) at org.apache.spark.sql.DataFrameReader.$anonfun$load$1(DataFrameReader.scala:233) at scala.Option.map(Option.scala:230) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:174) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:564) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at py4j.Gateway.invoke(Gateway.java:282) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182) at py4j.ClientServerConnection.run(ClientServerConnection.java:106) at java.base/java.lang.Thread.run(Thread.java:832) Caused by: java.lang.ClassNotFoundException: com.datastax.spark.connector.util.Logging at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:606) at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:168) at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522) ... 28 more 21/11/13 13:17:49 INFO SparkContext: Invoking stop() from shutdown hook 21/11/13 13:17:49 INFO SparkUI: Stopped Spark web UI at http://192.168.1.3:4040 21/11/13 13:17:49 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! 21/11/13 13:17:49 INFO MemoryStore: MemoryStore cleared 21/11/13 13:17:49 INFO BlockManager: BlockManager stopped 21/11/13 13:17:49 INFO BlockManagerMaster: BlockManagerMaster stopped 21/11/13 13:17:49 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped! 21/11/13 13:17:49 INFO SparkContext: Successfully stopped SparkContext 21/11/13 13:17:49 INFO ShutdownHookManager: Shutdown hook called 21/11/13 13:17:49 INFO ShutdownHookManager: Deleting directory /private/var/folders/4q/qc3xhr1x6qx5jr9604nl91w40000gn/T/spark-ef03b69b-8170-49e1-a24f-af46ff8ada7d 21/11/13 13:17:49 INFO ShutdownHookManager: Deleting directory /private/var/folders/4q/qc3xhr1x6qx5jr9604nl91w40000gn/T/spark-3961cb18-dacf-4940-a5ff-36d1bbc2c3bb/pyspark-42c7c117-c948-4b16-82a6-39017769cff9 21/11/13 13:17:49 INFO ShutdownHookManager: Deleting directory /private/var/folders/4q/qc3xhr1x6qx5jr9604nl91w40000gn/T/spark-3961cb18-dacf-4940-a5ff-36d1bbc2c3bb The application use spark-cassandra-connector to read from cassandra. Here is the code: from pyspark.sql import SQLContext, SparkSession from pyspark.context import SparkContext spark = SparkSession\ .builder\ .appName("Test")\ .master('local[*]') \ .config('spark.cassandra.connection.host', 'localhost') \ .getOrCreate() spark.read\ .format("org.apache.spark.sql.cassandra")\ .options(table="gps", keyspace="test")\ .load().show() I've tried two different approaches to submit the application: $SPARK_HOME/bin/spark-submit --packages com.datastax.spark:spark-cassandra-connector_2.12:3.1.0 ./test.py $SPARK_HOME/bin/spark-submit --jars /Full/Path/to/spark-cassandra-connector-assembly_2.12-3.1.0.jar Also when I run the same code in pyspark shell, it works fine. Spark 3.2.0 spark-cassandra-connector 3.1.0 cassandra 4.0.1
Permission denied error when setting up local Spark instance and running pyspark
I am setting up a local Spark instance on Windows to use with PySpark as described in this guide (but with spark-3.0.0 / hadoop 2.7 instead): https://phoenixnap.com/kb/install-spark-on-windows-10. I can startup Spark with: C:\Spark\spark-3.0.0-bin-hadoop2.7\bin>spark-shell.cmd and connect to it with http://localhost:4040/ in my browser (I see the Spark GUI). But when am running the Python pyspark example with C:\Spark\spark-3.0.0-bin-hadoop2.7\examples>run-example SparkPi it throws an Permission Denied error like in this trace: 21/03/08 10:51:03 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 21/03/08 10:51:04 INFO SparkContext: Running Spark version 3.0.0 21/03/08 10:51:04 INFO ResourceUtils: ============================================================== 21/03/08 10:51:04 INFO ResourceUtils: Resources for spark.driver: 21/03/08 10:51:04 INFO ResourceUtils: ============================================================== 21/03/08 10:51:04 INFO SparkContext: Submitted application: Spark Pi 21/03/08 10:51:04 INFO SecurityManager: Changing view acls to: ##### 21/03/08 10:51:04 INFO SecurityManager: Changing modify acls to: ##### 21/03/08 10:51:04 INFO SecurityManager: Changing view acls groups to: 21/03/08 10:51:04 INFO SecurityManager: Changing modify acls groups to: 21/03/08 10:51:04 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(#####); groups with view permissions: Set(); users with modify permissions: Set(#####); groups with modify permissions: Set() 21/03/08 10:51:05 INFO Utils: Successfully started service 'sparkDriver' on port 63213. 21/03/08 10:51:05 INFO SparkEnv: Registering MapOutputTracker 21/03/08 10:51:05 INFO SparkEnv: Registering BlockManagerMaster 21/03/08 10:51:05 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information 21/03/08 10:51:05 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up 21/03/08 10:51:05 INFO SparkEnv: Registering BlockManagerMasterHeartbeat 21/03/08 10:51:05 INFO DiskBlockManager: Created local directory at C:\Users\#####\AppData\Local\Temp\blockmgr-dce03954-27a7-484d-8e54-f552b21433f7 21/03/08 10:51:05 INFO MemoryStore: MemoryStore started with capacity 366.3 MiB 21/03/08 10:51:05 INFO SparkEnv: Registering OutputCommitCoordinator 21/03/08 10:51:05 INFO Utils: Successfully started service 'SparkUI' on port 4040. 21/03/08 10:51:05 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://WORKSTATION.DOMAIN.EXT:4040 21/03/08 10:51:05 INFO SparkContext: Added JAR file:///C:/Spark/spark-3.0.0-bin-hadoop2.7/examples/jars/scopt_2.12-3.7.1.jar at spark://WORKSTATION.DOMAIN.EXT:63213/jars/scopt_2.12-3.7.1.jar with timestamp 1615197065578 21/03/08 10:51:05 INFO SparkContext: Added JAR file:///C:/Spark/spark-3.0.0-bin-hadoop2.7/examples/jars/spark-examples_2.12-3.0.0.jar at spark://WORKSTATION.DOMAIN.EXT:63213/jars/spark-examples_2.12-3.0.0.jar with timestamp 1615197065579 21/03/08 10:51:05 INFO Executor: Starting executor ID driver on host WORKSTATION.DOMAIN.EXT 21/03/08 10:51:05 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 63260. 21/03/08 10:51:05 INFO NettyBlockTransferService: Server created on WORKSTATION.DOMAIN.EXT:63260 21/03/08 10:51:05 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy 21/03/08 10:51:05 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, NLLR4000250910.solon.prd, 63260, None) 21/03/08 10:51:05 INFO BlockManagerMasterEndpoint: Registering block manager NLLR4000250910.solon.prd:63260 with 366.3 MiB RAM, BlockManagerId(driver, WORKSTATION.DOMAIN.EXT, 63260, None) 21/03/08 10:51:05 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, NLLR4000250910.solon.prd, 63260, None) 21/03/08 10:51:05 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, NLLR4000250910.solon.prd, 63260, None) 21/03/08 10:51:06 INFO SparkContext: Starting job: reduce at SparkPi.scala:38 21/03/08 10:51:06 INFO DAGScheduler: Got job 0 (reduce at SparkPi.scala:38) with 2 output partitions 21/03/08 10:51:06 INFO DAGScheduler: Final stage: ResultStage 0 (reduce at SparkPi.scala:38) 21/03/08 10:51:06 INFO DAGScheduler: Parents of final stage: List() 21/03/08 10:51:06 INFO DAGScheduler: Missing parents: List() 21/03/08 10:51:06 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents 21/03/08 10:51:06 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 3.1 KiB, free 366.3 MiB) 21/03/08 10:51:06 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1816.0 B, free 366.3 MiB) 21/03/08 10:51:06 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on WORKSTATION.DOMAIN.EXT:63260 (size: 1816.0 B, free: 366.3 MiB) 21/03/08 10:51:06 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1200 21/03/08 10:51:06 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34) (first 15 tasks are for partitions Vector(0, 1)) 21/03/08 10:51:06 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks 21/03/08 10:51:06 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, WORKSTATION.DOMAIN.EXT, executor driver, partition 0, PROCESS_LOCAL, 7393 bytes) 21/03/08 10:51:06 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, WORKSTATION.DOMAIN.EXT, executor driver, partition 1, PROCESS_LOCAL, 7393 bytes) 21/03/08 10:51:06 INFO Executor: Running task 1.0 in stage 0.0 (TID 1) 21/03/08 10:51:06 INFO Executor: Running task 0.0 in stage 0.0 (TID 0) 21/03/08 10:51:06 INFO Executor: Fetching spark://WORKSTATION.DOMAIN.EXT:63213/jars/spark-examples_2.12-3.0.0.jar with timestamp 1615197065579 21/03/08 10:51:06 ERROR Utils: Aborting task java.io.IOException: Failed to connect to WORKSTATION.DOMAIN.EXT/192.168.#.#:63213 at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:253) at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:195) at org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:392) at org.apache.spark.rpc.netty.NettyRpcEnv.$anonfun$openChannel$4(NettyRpcEnv.scala:360) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1411) at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:359) at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:719) at org.apache.spark.util.Utils$.fetchFile(Utils.scala:535) at org.apache.spark.executor.Executor.$anonfun$updateDependencies$7(Executor.scala:869) at org.apache.spark.executor.Executor.$anonfun$updateDependencies$7$adapted(Executor.scala:860) at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:877) at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149) at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237) at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230) at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44) at scala.collection.mutable.HashMap.foreach(HashMap.scala:149) at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:876) at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:860) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:404) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Caused by: io.netty.channel.AbstractChannel$AnnotatedSocketException: Permission denied: no further information: WORKSTATION.DOMAIN.EXT/192.168.#.#:63213 Caused by: java.net.SocketException: Permission denied: no further information at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source) at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330) at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334) at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702) at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650) at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.lang.Thread.run(Unknown Source) [snip] When running it on a different machine with seemingly the same config where it works fine, I get this trace on the part where the Exception is thrown on the other trace: [snip] 21/03/08 08:00:22 INFO Executor: Running task 0.0 in stage 0.0 (TID 0) 21/03/08 08:00:22 INFO Executor: Running task 1.0 in stage 0.0 (TID 1) 21/03/08 08:00:22 INFO Executor: Fetching spark://WORKSTATION.DOMAIN.EXT:63646/jars/spark-examples_2.12-3.0.0.jar with timestamp 1615186820489 21/03/08 08:00:22 INFO TransportClientFactory: Successfully created connection to WORKSTATION.DOMAIN.EXT/10.121.#.#:63646 after 86 ms (0 ms spent in bootstraps) 21/03/08 08:00:22 INFO Utils: Fetching spark://WORKSTATION.DOMAIN.EXT:63646/jars/spark-examples_2.12-3.0.0.jar to C:\Users\#####\AppData\Local\Temp\spark-54a13d9f-9064-4f34-ba81-af49b18d9a0c\userFiles-24c3eabc-02a4-4aca-8abb-424431c6442f\fetchFileTemp5258763437798623210.tmp 21/03/08 08:00:24 INFO Executor: Adding file:/C:/Users/#####/AppData/Local/Temp/spark-54a13d9f-9064-4f34-ba81-af49b18d9a0c/userFiles-24c3eabc-02a4-4aca-8abb-424431c6442f/spark-examples_2.12-3.0.0.jar to class loader [snip] At first it seemed to me as a Firewall issue, but adding the executing java.exe as exeption to the firewall didn't solve the issue. Does anyone know what I should try next to get this issue resolved?
Finally I could solve it by setting my SPARK_LOCAL_IP to localhost in my environment variables: Go to your Windows environment variables and set SPARK_LOCAL_IP=localhost
Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve '`product`' given input columns: [jsontostructs(message)];
C:\Users\sorun\.jdks\openjdk-14.0.1\bin\java.exe "-javaagent:D:\Intellij IDEA\IntelliJ IDEA 2020.1.1\lib\idea_rt.jar=50945:D:\Intellij IDEA\IntelliJ IDEA 2020.1.1\bin" -Dfile.encoding=UTF-8 -classpath C:\Users\sorun\IdeaProjects\spark-streaming-kafka\target\classes;C:\Users\sorun\.m2\repository\org\apache\spark\spark-sql_2.11\2.2.0\spark-sql_2.11-2.2.0.jar;C:\Users\sorun\.m2\repository\com\univocity\univocity-parsers\2.2.1\univocity-parsers-2.2.1.jar;C:\Users\sorun\.m2\repository\org\apache\spark\spark-sketch_2.11\2.2.0\spark-sketch_2.11-2.2.0.jar;C:\Users\sorun\.m2\repository\org\apache\spark\spark-core_2.11\2.2.0\spark-core_2.11-2.2.0.jar;C:\Users\sorun\.m2\repository\org\apache\avro\avro\1.7.7\avro-1.7.7.jar;C:\Users\sorun\.m2\repository\com\thoughtworks\paranamer\paranamer\2.3\paranamer-2.3.jar;C:\Users\sorun\.m2\repository\org\apache\commons\commons-compress\1.4.1\commons-compress-1.4.1.jar;C:\Users\sorun\.m2\repository\org\tukaani\xz\1.0\xz-1.0.jar;C:\Users\sorun\.m2\repository\org\apache\avro\avro-mapred\1.7.7\avro-mapred-1.7.7-hadoop2.jar;C:\Users\sorun\.m2\repository\org\apache\avro\avro-ipc\1.7.7\avro-ipc-1.7.7.jar;C:\Users\sorun\.m2\repository\org\apache\avro\avro-ipc\1.7.7\avro-ipc-1.7.7-tests.jar;C:\Users\sorun\.m2\repository\com\twitter\chill_2.11\0.8.0\chill_2.11-0.8.0.jar;C:\Users\sorun\.m2\repository\com\esotericsoftware\kryo-shaded\3.0.3\kryo-shaded-3.0.3.jar;C:\Users\sorun\.m2\repository\com\esotericsoftware\minlog\1.3.0\minlog-1.3.0.jar;C:\Users\sorun\.m2\repository\org\objenesis\objenesis\2.1\objenesis-2.1.jar;C:\Users\sorun\.m2\repository\com\twitter\chill-java\0.8.0\chill-java-0.8.0.jar;C:\Users\sorun\.m2\repository\org\apache\hadoop\hadoop-client\2.6.5\hadoop-client-2.6.5.jar;C:\Users\sorun\.m2\repository\org\apache\hadoop\hadoop-common\2.6.5\hadoop-common-2.6.5.jar;C:\Users\sorun\.m2\repository\commons-cli\commons-cli\1.2\commons-cli-1.2.jar;C:\Users\sorun\.m2\repository\xmlenc\xmlenc\0.52\xmlenc-0.52.jar;C:\Users\sorun\.m2\repository\commons-httpclient\commons-httpclient\3.1\commons-httpclient-3.1.jar;C:\Users\sorun\.m2\repository\commons-io\commons-io\2.4\commons-io-2.4.jar;C:\Users\sorun\.m2\repository\commons-collections\commons-collections\3.2.2\commons-collections-3.2.2.jar;C:\Users\sorun\.m2\repository\commons-lang\commons-lang\2.6\commons-lang-2.6.jar;C:\Users\sorun\.m2\repository\commons-configuration\commons-configuration\1.6\commons-configuration-1.6.jar;C:\Users\sorun\.m2\repository\commons-digester\commons-digester\1.8\commons-digester-1.8.jar;C:\Users\sorun\.m2\repository\commons-beanutils\commons-beanutils\1.7.0\commons-beanutils-1.7.0.jar;C:\Users\sorun\.m2\repository\commons-beanutils\commons-beanutils-core\1.8.0\commons-beanutils-core-1.8.0.jar;C:\Users\sorun\.m2\repository\com\google\protobuf\protobuf-java\2.5.0\protobuf-java-2.5.0.jar;C:\Users\sorun\.m2\repository\org\apache\hadoop\hadoop-auth\2.6.5\hadoop-auth-2.6.5.jar;C:\Users\sorun\.m2\repository\org\apache\directory\server\apacheds-kerberos-codec\2.0.0-M15\apacheds-kerberos-codec-2.0.0-M15.jar;C:\Users\sorun\.m2\repository\org\apache\directory\server\apacheds-i18n\2.0.0-M15\apacheds-i18n-2.0.0-M15.jar;C:\Users\sorun\.m2\repository\org\apache\directory\api\api-asn1-api\1.0.0-M20\api-asn1-api-1.0.0-M20.jar;C:\Users\sorun\.m2\repository\org\apache\directory\api\api-util\1.0.0-M20\api-util-1.0.0-M20.jar;C:\Users\sorun\.m2\repository\org\apache\curator\curator-client\2.6.0\curator-client-2.6.0.jar;C:\Users\sorun\.m2\repository\org\htrace\htrace-core\3.0.4\htrace-core-3.0.4.jar;C:\Users\sorun\.m2\repository\org\apache\hadoop\hadoop-hdfs\2.6.5\hadoop-hdfs-2.6.5.jar;C:\Users\sorun\.m2\repository\org\mortbay\jetty\jetty-util\6.1.26\jetty-util-6.1.26.jar;C:\Users\sorun\.m2\repository\xerces\xercesImpl\2.9.1\xercesImpl-2.9.1.jar;C:\Users\sorun\.m2\repository\xml-apis\xml-apis\1.3.04\xml-apis-1.3.04.jar;C:\Users\sorun\.m2\repository\org\apache\hadoop\hadoop-mapreduce-client-app\2.6.5\hadoop-mapreduce-client-app-2.6.5.jar;C:\Users\sorun\.m2\repository\org\apache\hadoop\hadoop-mapreduce-client-common\2.6.5\hadoop-mapreduce-client-common-2.6.5.jar;C:\Users\sorun\.m2\repository\org\apache\hadoop\hadoop-yarn-client\2.6.5\hadoop-yarn-client-2.6.5.jar;C:\Users\sorun\.m2\repository\org\apache\hadoop\hadoop-yarn-server-common\2.6.5\hadoop-yarn-server-common-2.6.5.jar;C:\Users\sorun\.m2\repository\org\apache\hadoop\hadoop-mapreduce-client-shuffle\2.6.5\hadoop-mapreduce-client-shuffle-2.6.5.jar;C:\Users\sorun\.m2\repository\org\apache\hadoop\hadoop-yarn-api\2.6.5\hadoop-yarn-api-2.6.5.jar;C:\Users\sorun\.m2\repository\org\apache\hadoop\hadoop-mapreduce-client-core\2.6.5\hadoop-mapreduce-client-core-2.6.5.jar;C:\Users\sorun\.m2\repository\org\apache\hadoop\hadoop-yarn-common\2.6.5\hadoop-yarn-common-2.6.5.jar;C:\Users\sorun\.m2\repository\javax\xml\bind\jaxb-api\2.2.2\jaxb-api-2.2.2.jar;C:\Users\sorun\.m2\repository\javax\xml\stream\stax-api\1.0-2\stax-api-1.0-2.jar;C:\Users\sorun\.m2\repository\org\codehaus\jackson\jackson-jaxrs\1.9.13\jackson-jaxrs-1.9.13.jar;C:\Users\sorun\.m2\repository\org\codehaus\jackson\jackson-xc\1.9.13\jackson-xc-1.9.13.jar;C:\Users\sorun\.m2\repository\org\apache\hadoop\hadoop-mapreduce-client-jobclient\2.6.5\hadoop-mapreduce-client-jobclient-2.6.5.jar;C:\Users\sorun\.m2\repository\org\apache\hadoop\hadoop-annotations\2.6.5\hadoop-annotations-2.6.5.jar;C:\Users\sorun\.m2\repository\org\apache\spark\spark-launcher_2.11\2.2.0\spark-launcher_2.11-2.2.0.jar;C:\Users\sorun\.m2\repository\org\apache\spark\spark-network-common_2.11\2.2.0\spark-network-common_2.11-2.2.0.jar;C:\Users\sorun\.m2\repository\org\fusesource\leveldbjni\leveldbjni-all\1.8\leveldbjni-all-1.8.jar;C:\Users\sorun\.m2\repository\org\apache\spark\spark-network-shuffle_2.11\2.2.0\spark-network-shuffle_2.11-2.2.0.jar;C:\Users\sorun\.m2\repository\org\apache\spark\spark-unsafe_2.11\2.2.0\spark-unsafe_2.11-2.2.0.jar;C:\Users\sorun\.m2\repository\net\java\dev\jets3t\jets3t\0.9.3\jets3t-0.9.3.jar;C:\Users\sorun\.m2\repository\org\apache\httpcomponents\httpcore\4.3.3\httpcore-4.3.3.jar;C:\Users\sorun\.m2\repository\org\apache\httpcomponents\httpclient\4.3.6\httpclient-4.3.6.jar;C:\Users\sorun\.m2\repository\javax\activation\activation\1.1.1\activation-1.1.1.jar;C:\Users\sorun\.m2\repository\mx4j\mx4j\3.0.2\mx4j-3.0.2.jar;C:\Users\sorun\.m2\repository\javax\mail\mail\1.4.7\mail-1.4.7.jar;C:\Users\sorun\.m2\repository\org\bouncycastle\bcprov-jdk15on\1.51\bcprov-jdk15on-1.51.jar;C:\Users\sorun\.m2\repository\com\jamesmurty\utils\java-xmlbuilder\1.0\java-xmlbuilder-1.0.jar;C:\Users\sorun\.m2\repository\net\iharder\base64\2.3.8\base64-2.3.8.jar;C:\Users\sorun\.m2\repository\org\apache\curator\curator-recipes\2.6.0\curator-recipes-2.6.0.jar;C:\Users\sorun\.m2\repository\org\apache\curator\curator-framework\2.6.0\curator-framework-2.6.0.jar;C:\Users\sorun\.m2\repository\org\apache\zookeeper\zookeeper\3.4.6\zookeeper-3.4.6.jar;C:\Users\sorun\.m2\repository\com\google\guava\guava\16.0.1\guava-16.0.1.jar;C:\Users\sorun\.m2\repository\javax\servlet\javax.servlet-api\3.1.0\javax.servlet-api-3.1.0.jar;C:\Users\sorun\.m2\repository\org\apache\commons\commons-lang3\3.5\commons-lang3-3.5.jar;C:\Users\sorun\.m2\repository\org\apache\commons\commons-math3\3.4.1\commons-math3-3.4.1.jar;C:\Users\sorun\.m2\repository\com\google\code\findbugs\jsr305\1.3.9\jsr305-1.3.9.jar;C:\Users\sorun\.m2\repository\org\slf4j\slf4j-api\1.7.16\slf4j-api-1.7.16.jar;C:\Users\sorun\.m2\repository\org\slf4j\jul-to-slf4j\1.7.16\jul-to-slf4j-1.7.16.jar;C:\Users\sorun\.m2\repository\org\slf4j\jcl-over-slf4j\1.7.16\jcl-over-slf4j-1.7.16.jar;C:\Users\sorun\.m2\repository\log4j\log4j\1.2.17\log4j-1.2.17.jar;C:\Users\sorun\.m2\repository\org\slf4j\slf4j-log4j12\1.7.16\slf4j-log4j12-1.7.16.jar;C:\Users\sorun\.m2\repository\com\ning\compress-lzf\1.0.3\compress-lzf-1.0.3.jar;C:\Users\sorun\.m2\repository\org\xerial\snappy\snappy-java\1.1.2.6\snappy-java-1.1.2.6.jar;C:\Users\sorun\.m2\repository\net\jpountz\lz4\lz4\1.3.0\lz4-1.3.0.jar;C:\Users\sorun\.m2\repository\org\roaringbitmap\RoaringBitmap\0.5.11\RoaringBitmap-0.5.11.jar;C:\Users\sorun\.m2\repository\commons-net\commons-net\2.2\commons-net-2.2.jar;C:\Users\sorun\.m2\repository\org\scala-lang\scala-library\2.11.8\scala-library-2.11.8.jar;C:\Users\sorun\.m2\repository\org\json4s\json4s-jackson_2.11\3.2.11\json4s-jackson_2.11-3.2.11.jar;C:\Users\sorun\.m2\repository\org\json4s\json4s-core_2.11\3.2.11\json4s-core_2.11-3.2.11.jar;C:\Users\sorun\.m2\repository\org\json4s\json4s-ast_2.11\3.2.11\json4s-ast_2.11-3.2.11.jar;C:\Users\sorun\.m2\repository\org\scala-lang\scalap\2.11.0\scalap-2.11.0.jar;C:\Users\sorun\.m2\repository\org\scala-lang\scala-compiler\2.11.0\scala-compiler-2.11.0.jar;C:\Users\sorun\.m2\repository\org\scala-lang\modules\scala-xml_2.11\1.0.1\scala-xml_2.11-1.0.1.jar;C:\Users\sorun\.m2\repository\org\scala-lang\modules\scala-parser-combinators_2.11\1.0.1\scala-parser-combinators_2.11-1.0.1.jar;C:\Users\sorun\.m2\repository\org\glassfish\jersey\core\jersey-client\2.22.2\jersey-client-2.22.2.jar;C:\Users\sorun\.m2\repository\javax\ws\rs\javax.ws.rs-api\2.0.1\javax.ws.rs-api-2.0.1.jar;C:\Users\sorun\.m2\repository\org\glassfish\hk2\hk2-api\2.4.0-b34\hk2-api-2.4.0-b34.jar;C:\Users\sorun\.m2\repository\org\glassfish\hk2\hk2-utils\2.4.0-b34\hk2-utils-2.4.0-b34.jar;C:\Users\sorun\.m2\repository\org\glassfish\hk2\external\aopalliance-repackaged\2.4.0-b34\aopalliance-repackaged-2.4.0-b34.jar;C:\Users\sorun\.m2\repository\org\glassfish\hk2\external\javax.inject\2.4.0-b34\javax.inject-2.4.0-b34.jar;C:\Users\sorun\.m2\repository\org\glassfish\hk2\hk2-locator\2.4.0-b34\hk2-locator-2.4.0-b34.jar;C:\Users\sorun\.m2\repository\org\javassist\javassist\3.18.1-GA\javassist-3.18.1-GA.jar;C:\Users\sorun\.m2\repository\org\glassfish\jersey\core\jersey-common\2.22.2\jersey-common-2.22.2.jar;C:\Users\sorun\.m2\repository\javax\annotation\javax.annotation-api\1.2\javax.annotation-api-1.2.jar;C:\Users\sorun\.m2\repository\org\glassfish\jersey\bundles\repackaged\jersey-guava\2.22.2\jersey-guava-2.22.2.jar;C:\Users\sorun\.m2\repository\org\glassfish\hk2\osgi-resource-locator\1.0.1\osgi-resource-locator-1.0.1.jar;C:\Users\sorun\.m2\repository\org\glassfish\jersey\core\jersey-server\2.22.2\jersey-server-2.22.2.jar;C:\Users\sorun\.m2\repository\org\glassfish\jersey\media\jersey-media-jaxb\2.22.2\jersey-media-jaxb-2.22.2.jar;C:\Users\sorun\.m2\repository\javax\validation\validation-api\1.1.0.Final\validation-api-1.1.0.Final.jar;C:\Users\sorun\.m2\repository\org\glassfish\jersey\containers\jersey-container-servlet\2.22.2\jersey-container-servlet-2.22.2.jar;C:\Users\sorun\.m2\repository\org\glassfish\jersey\containers\jersey-container-servlet-core\2.22.2\jersey-container-servlet-core-2.22.2.jar;C:\Users\sorun\.m2\repository\io\netty\netty-all\4.0.43.Final\netty-all-4.0.43.Final.jar;C:\Users\sorun\.m2\repository\io\netty\netty\3.9.9.Final\netty-3.9.9.Final.jar;C:\Users\sorun\.m2\repository\com\clearspring\analytics\stream\2.7.0\stream-2.7.0.jar;C:\Users\sorun\.m2\repository\io\dropwizard\metrics\metrics-core\3.1.2\metrics-core-3.1.2.jar;C:\Users\sorun\.m2\repository\io\dropwizard\metrics\metrics-jvm\3.1.2\metrics-jvm-3.1.2.jar;C:\Users\sorun\.m2\repository\io\dropwizard\metrics\metrics-json\3.1.2\metrics-json-3.1.2.jar;C:\Users\sorun\.m2\repository\io\dropwizard\metrics\metrics-graphite\3.1.2\metrics-graphite-3.1.2.jar;C:\Users\sorun\.m2\repository\com\fasterxml\jackson\module\jackson-module-scala_2.11\2.6.5\jackson-module-scala_2.11-2.6.5.jar;C:\Users\sorun\.m2\repository\com\fasterxml\jackson\module\jackson-module-paranamer\2.6.5\jackson-module-paranamer-2.6.5.jar;C:\Users\sorun\.m2\repository\org\apache\ivy\ivy\2.4.0\ivy-2.4.0.jar;C:\Users\sorun\.m2\repository\oro\oro\2.0.8\oro-2.0.8.jar;C:\Users\sorun\.m2\repository\net\razorvine\pyrolite\4.13\pyrolite-4.13.jar;C:\Users\sorun\.m2\repository\net\sf\py4j\py4j\0.10.4\py4j-0.10.4.jar;C:\Users\sorun\.m2\repository\org\apache\commons\commons-crypto\1.0.0\commons-crypto-1.0.0.jar;C:\Users\sorun\.m2\repository\org\apache\spark\spark-catalyst_2.11\2.2.0\spark-catalyst_2.11-2.2.0.jar;C:\Users\sorun\.m2\repository\org\scala-lang\scala-reflect\2.11.8\scala-reflect-2.11.8.jar;C:\Users\sorun\.m2\repository\org\codehaus\janino\janino\3.0.0\janino-3.0.0.jar;C:\Users\sorun\.m2\repository\org\codehaus\janino\commons-compiler\3.0.0\commons-compiler-3.0.0.jar;C:\Users\sorun\.m2\repository\org\antlr\antlr4-runtime\4.5.3\antlr4-runtime-4.5.3.jar;C:\Users\sorun\.m2\repository\commons-codec\commons-codec\1.10\commons-codec-1.10.jar;C:\Users\sorun\.m2\repository\org\apache\spark\spark-tags_2.11\2.2.0\spark-tags_2.11-2.2.0.jar;C:\Users\sorun\.m2\repository\org\apache\parquet\parquet-column\1.8.2\parquet-column-1.8.2.jar;C:\Users\sorun\.m2\repository\org\apache\parquet\parquet-common\1.8.2\parquet-common-1.8.2.jar;C:\Users\sorun\.m2\repository\org\apache\parquet\parquet-encoding\1.8.2\parquet-encoding-1.8.2.jar;C:\Users\sorun\.m2\repository\org\apache\parquet\parquet-hadoop\1.8.2\parquet-hadoop-1.8.2.jar;C:\Users\sorun\.m2\repository\org\apache\parquet\parquet-format\2.3.1\parquet-format-2.3.1.jar;C:\Users\sorun\.m2\repository\org\apache\parquet\parquet-jackson\1.8.2\parquet-jackson-1.8.2.jar;C:\Users\sorun\.m2\repository\org\codehaus\jackson\jackson-mapper-asl\1.9.11\jackson-mapper-asl-1.9.11.jar;C:\Users\sorun\.m2\repository\org\codehaus\jackson\jackson-core-asl\1.9.11\jackson-core-asl-1.9.11.jar;C:\Users\sorun\.m2\repository\com\fasterxml\jackson\core\jackson-databind\2.6.5\jackson-databind-2.6.5.jar;C:\Users\sorun\.m2\repository\com\fasterxml\jackson\core\jackson-annotations\2.6.0\jackson-annotations-2.6.0.jar;C:\Users\sorun\.m2\repository\com\fasterxml\jackson\core\jackson-core\2.6.5\jackson-core-2.6.5.jar;C:\Users\sorun\.m2\repository\org\apache\xbean\xbean-asm5-shaded\4.4\xbean-asm5-shaded-4.4.jar;C:\Users\sorun\.m2\repository\org\spark-project\spark\unused\1.0.0\unused-1.0.0.jar;C:\Users\sorun\.m2\repository\org\apache\spark\spark-sql-kafka-0-10_2.11\2.2.0\spark-sql-kafka-0-10_2.11-2.2.0.jar;C:\Users\sorun\.m2\repository\org\apache\kafka\kafka-clients\0.10.0.1\kafka-clients-0.10.0.1.jar;C:\Users\sorun\.m2\repository\com\google\code\gson\gson\2.8.3\gson-2.8.3.jar StreamingConsumer Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 20/06/19 12:39:42 INFO SparkContext: Running Spark version 2.2.0 WARNING: An illegal reflective access operation has occurred WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/C:/Users/sorun/.m2/repository/org/apache/hadoop/hadoop-auth/2.6.5/hadoop-auth-2.6.5.jar) to method sun.security.krb5.Config.getInstance() WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations WARNING: All illegal access operations will be denied in a future release 20/06/19 12:39:43 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 20/06/19 12:39:44 INFO SparkContext: Submitted application: Streaming-kafka 20/06/19 12:39:44 INFO SecurityManager: Changing view acls to: OZAN-OKAN 20/06/19 12:39:44 INFO SecurityManager: Changing modify acls to: OZAN-OKAN 20/06/19 12:39:44 INFO SecurityManager: Changing view acls groups to: 20/06/19 12:39:44 INFO SecurityManager: Changing modify acls groups to: 20/06/19 12:39:44 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(OZAN-OKAN); groups with view permissions: Set(); users with modify permissions: Set(OZAN-OKAN); groups with modify permissions: Set() 20/06/19 12:39:45 INFO Utils: Successfully started service 'sparkDriver' on port 50966. 20/06/19 12:39:45 INFO SparkEnv: Registering MapOutputTracker 20/06/19 12:39:45 INFO SparkEnv: Registering BlockManagerMaster 20/06/19 12:39:45 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information 20/06/19 12:39:45 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up 20/06/19 12:39:45 INFO DiskBlockManager: Created local directory at C:\Users\sorun\AppData\Local\Temp\blockmgr-0794380e-6e2b-4559-bf6c-7d10c2074bc8 20/06/19 12:39:45 INFO MemoryStore: MemoryStore started with capacity 1040.4 MB 20/06/19 12:39:45 INFO SparkEnv: Registering OutputCommitCoordinator 20/06/19 12:39:45 INFO Utils: Successfully started service 'SparkUI' on port 4040. 20/06/19 12:39:46 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.56.1:4040 20/06/19 12:39:46 INFO Executor: Starting executor ID driver on host localhost 20/06/19 12:39:46 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 50975. 20/06/19 12:39:46 INFO NettyBlockTransferService: Server created on 192.168.56.1:50975 20/06/19 12:39:46 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy 20/06/19 12:39:46 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.56.1, 50975, None) 20/06/19 12:39:46 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.56.1:50975 with 1040.4 MB RAM, BlockManagerId(driver, 192.168.56.1, 50975, None) 20/06/19 12:39:46 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.56.1, 50975, None) 20/06/19 12:39:46 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.56.1, 50975, None) 20/06/19 12:39:46 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/C:/Users/sorun/IdeaProjects/spark-streaming-kafka/spark-warehouse/'). 20/06/19 12:39:46 INFO SharedState: Warehouse path is 'file:/C:/Users/sorun/IdeaProjects/spark-streaming-kafka/spark-warehouse/'. 20/06/19 12:39:47 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint 20/06/19 12:39:47 INFO CatalystSqlParser: Parsing command: string 20/06/19 12:39:49 INFO SparkSqlParser: Parsing command: CAST(value AS STRING) message Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve '`product`' given input columns: [jsontostructs(message)]; at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42) at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:88) at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:85) at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:289) at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:289) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70) at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:288) at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:286) at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:286) at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306) at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187) at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304) at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:286) at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:286) at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:286) at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4$$anonfun$apply$10.apply(TreeNode.scala:323) at scala.collection.MapLike$MappedValues$$anonfun$iterator$3.apply(MapLike.scala:246) at scala.collection.MapLike$MappedValues$$anonfun$iterator$3.apply(MapLike.scala:246) at scala.collection.Iterator$$anon$11.next(Iterator.scala:409) at scala.collection.Iterator$class.foreach(Iterator.scala:893) at scala.collection.AbstractIterator.foreach(Iterator.scala:1336) at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) at scala.collection.IterableLike$$anon$1.foreach(IterableLike.scala:311) at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59) at scala.collection.mutable.MapBuilder.$plus$plus$eq(MapBuilder.scala:25) at scala.collection.TraversableViewLike$class.force(TraversableViewLike.scala:88) at scala.collection.IterableLike$$anon$1.force(IterableLike.scala:311) at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:331) at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187) at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304) at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:286) at org.apache.spark.sql.catalyst.plans.QueryPlan$$anonfun$transformExpressionsUp$1.apply(QueryPlan.scala:268) at org.apache.spark.sql.catalyst.plans.QueryPlan$$anonfun$transformExpressionsUp$1.apply(QueryPlan.scala:268) at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:279) at org.apache.spark.sql.catalyst.plans.QueryPlan.org$apache$spark$sql$catalyst$plans$QueryPlan$$recursiveTransform$1(QueryPlan.scala:289) at org.apache.spark.sql.catalyst.plans.QueryPlan$$anonfun$6.apply(QueryPlan.scala:298) at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187) at org.apache.spark.sql.catalyst.plans.QueryPlan.mapExpressions(QueryPlan.scala:298) at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsUp(QueryPlan.scala:268) at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:85) at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:78) at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127) at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:78) at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:91) at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.resolveAndBind(ExpressionEncoder.scala:256) at org.apache.spark.sql.Dataset.<init>(Dataset.scala:206) at org.apache.spark.sql.Dataset.<init>(Dataset.scala:170) at org.apache.spark.sql.Dataset$.apply(Dataset.scala:61) at org.apache.spark.sql.Dataset.as(Dataset.scala:380) at StreamingConsumer.main(StreamingConsumer.java:24) 20/06/19 12:39:50 INFO SparkContext: Invoking stop() from shutdown hook 20/06/19 12:39:50 INFO SparkUI: Stopped Spark web UI at http://192.168.56.1:4040 20/06/19 12:39:50 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! 20/06/19 12:39:50 INFO MemoryStore: MemoryStore cleared 20/06/19 12:39:50 INFO BlockManager: BlockManager stopped 20/06/19 12:39:50 INFO BlockManagerMaster: BlockManagerMaster stopped 20/06/19 12:39:50 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped! 20/06/19 12:39:50 INFO SparkContext: Successfully stopped SparkContext 20/06/19 12:39:50 INFO ShutdownHookManager: Shutdown hook called 20/06/19 12:39:50 INFO ShutdownHookManager: Deleting directory C:\Users\sorun\AppData\Local\Temp\spark-b70ecbcc-e6cf-4328-9069-97cc41cc72d7 Process finished with exit code 1 CODE
Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve '`product`' given input columns: [jsontostructs(message)]; Above exception message says the column which you are selecting is not available in DataFrame, rename the column jsontostructs(message) to product & use this column in select.
And if you have "message" field in your model, add it to schema struct type StructType schema = new StructType().add("product","string").add("time", DataTypes.TimestampType).add("message", DataTypes.StringType);
Change schema).as("json")) Dataset<SearchProductModel> data = load.selectExpr("CAST(value AS STRING) as message") .select(functions.from_json(functions.col("message"), schema).as("json")) .select("json.*") .as(Encoders.bean(SearchProductModel.class));
SPARK Error: java.lang.UnsatisfiedLinkError: /tmp/snappy-1.0.4.1-libsnappyjava [duplicate]
This question already has answers here: UnsatisfiedLinkError: no snappyjava in java.library.path when running Spark MLLib Unit test within Intellij (4 answers) UnsatisfiedLinkError: /tmp/snappy-1.1.4-libsnappyjava.so Error loading shared library ld-linux-x86-64.so.2: No such file or directory (8 answers) spark returns error libsnappyjava.so: failed to map segment from shared object: Operation not permitted (2 answers) Closed 3 years ago. I am running CDH 5.16 standalone singlenode in a RHEL 7 Server. I have written a simple spark code that reads a text file from HDFS and load it as parquet file in a separate location in HDFS. But when ever i am running this code in the server(i am using SBT to build jar and deploy it in cluster using spark-submit), following error is thrown: 19/06/07 12:56:04 INFO spark.SparkContext: Running Spark version 1.6.0 19/06/07 12:56:04 INFO spark.SecurityManager: Changing view acls to: ak_bng 19/06/07 12:56:04 INFO spark.SecurityManager: Changing modify acls to: ak_bng 19/06/07 12:56:04 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ak_bng); users with modify permissions: Set(ak_bng) 19/06/07 12:56:05 INFO util.Utils: Successfully started service 'sparkDriver' on port 44220. 19/06/07 12:56:05 INFO slf4j.Slf4jLogger: Slf4jLogger started 19/06/07 12:56:05 INFO Remoting: Starting remoting 19/06/07 12:56:05 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem#10.188.223.5:36304] 19/06/07 12:56:05 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkDriverActorSystem#10.188.223.5:36304] 19/06/07 12:56:05 INFO util.Utils: Successfully started service 'sparkDriverActorSystem' on port 36304. 19/06/07 12:56:05 INFO spark.SparkEnv: Registering MapOutputTracker 19/06/07 12:56:05 INFO spark.SparkEnv: Registering BlockManagerMaster 19/06/07 12:56:05 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-c38a27e3-c483-4f56-ab7f-56e4c1be0832 19/06/07 12:56:05 INFO storage.MemoryStore: MemoryStore started with capacity 530.0 MB 19/06/07 12:56:06 INFO spark.SparkEnv: Registering OutputCommitCoordinator 19/06/07 12:56:06 INFO util.Utils: Successfully started service 'SparkUI' on port 4040. 19/06/07 12:56:06 INFO ui.SparkUI: Started SparkUI at http://10.188.223.5:4040 19/06/07 12:56:06 INFO spark.SparkContext: Added JAR file:/home/ak_bng/spark_jars/Simple_Project-assembly-1.0.jar at spark://10.188.223.5:44220/jars/Simple_Project-assembly-1.0.jar with timestamp 1559892366578 19/06/07 12:56:06 INFO executor.Executor: Starting executor ID driver on host localhost 19/06/07 12:56:06 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 46170. 19/06/07 12:56:06 INFO netty.NettyBlockTransferService: Server created on 46170 19/06/07 12:56:06 INFO storage.BlockManager: external shuffle service port = 7337 19/06/07 12:56:06 INFO storage.BlockManagerMaster: Trying to register BlockManager 19/06/07 12:56:06 INFO storage.BlockManagerMasterEndpoint: Registering block manager localhost:46170 with 530.0 MB RAM, BlockManagerId(driver, localhost, 46170) 19/06/07 12:56:06 INFO storage.BlockManagerMaster: Registered BlockManager 19/06/07 12:56:07 INFO scheduler.EventLoggingListener: Logging events to hdfs://indelsrv185.in.kworld.kpmg.com:8020/user/spark/applicationHistory/local-1559892366602 19/06/07 12:56:07 INFO spark.SparkContext: Registered listener com.cloudera.spark.lineage.ClouderaNavigatorListener 19/06/07 12:56:08 INFO parquet.ParquetRelation: Listing hdfs://10.188.223.5:8020/user/ak_bng/products on driver 19/06/07 12:56:08 INFO parquet.ParquetRelation: Listing hdfs://10.188.223.5:8020/user/ak_bng/categories on driver java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:312) at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:219) at org.xerial.snappy.Snappy.<clinit>(Snappy.java:44) at org.apache.spark.io.SnappyCompressionCodec$.liftedTree1$1(CompressionCodec.scala:169) at org.apache.spark.io.SnappyCompressionCodec$.org$apache$spark$io$SnappyCompressionCodec$$version$lzycompute(CompressionCodec.scala:168) at org.apache.spark.io.SnappyCompressionCodec$.org$apache$spark$io$SnappyCompressionCodec$$version(CompressionCodec.scala:168) at org.apache.spark.io.SnappyCompressionCodec.<init>(CompressionCodec.scala:152) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:72) at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:65) at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$setConf(TorrentBroadcast.scala:74) at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:81) at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34) at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:63) at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1334) at org.apache.spark.sql.execution.datasources.DataSourceStrategy$.apply(DataSourceStrategy.scala:126) at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58) at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58) at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371) at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:59) at org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:48) at org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:46) at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:53) at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:53) at org.apache.spark.sql.execution.QueryExecution$$anonfun$toString$5.apply(QueryExecution.scala:81) at org.apache.spark.sql.execution.QueryExecution$$anonfun$toString$5.apply(QueryExecution.scala:81) at org.apache.spark.sql.execution.QueryExecution.stringOrError(QueryExecution.scala:61) at org.apache.spark.sql.execution.QueryExecution.toString(QueryExecution.scala:81) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:50) at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation.run(InsertIntoHadoopFsRelation.scala:106) at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58) at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56) at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150) at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130) at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:56) at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:56) at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:256) at org.apache.spark.sql.DataFrameWriter.dataSource$lzycompute$1(DataFrameWriter.scala:181) at org.apache.spark.sql.DataFrameWriter.org$apache$spark$sql$DataFrameWriter$$dataSource$1(DataFrameWriter.scala:181) at org.apache.spark.sql.DataFrameWriter$$anonfun$save$1.apply$mcV$sp(DataFrameWriter.scala:188) at org.apache.spark.sql.DataFrameWriter.executeAndCallQEListener(DataFrameWriter.scala:154) at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:188) at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:172) at org.apache.spark.sql.DataFrameWriter.parquet(DataFrameWriter.scala:370) at SimpleApp$.main(SimpleApp.scala:169) at SimpleApp.main(SimpleApp.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:730) at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: java.lang.UnsatisfiedLinkError: /tmp/snappy-1.0.4.1-libsnappyjava.so: /tmp/snappy-1.0.4.1-libsnappyjava.so: failed to map segment from shared object: Operation not permitted at java.lang.ClassLoader$NativeLibrary.load(Native Method) at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941) at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824) at java.lang.Runtime.load0(Runtime.java:809) at java.lang.System.load(System.java:1086) at org.xerial.snappy.SnappyNativeLoader.load(SnappyNativeLoader.java:39) ... 65 more Exception in thread "main" java.lang.IllegalArgumentException: java.lang.NoClassDefFoundError: Could not initialize class org.xerial.snappy.Snappy at org.apache.spark.io.SnappyCompressionCodec$.liftedTree1$1(CompressionCodec.scala:171) at org.apache.spark.io.SnappyCompressionCodec$.org$apache$spark$io$SnappyCompressionCodec$$version$lzycompute(CompressionCodec.scala:168) at org.apache.spark.io.SnappyCompressionCodec$.org$apache$spark$io$SnappyCompressionCodec$$version(CompressionCodec.scala:168) at org.apache.spark.io.SnappyCompressionCodec.<init>(CompressionCodec.scala:152) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:72) at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:65) at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$setConf(TorrentBroadcast.scala:74) at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:81) at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34) at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:63) at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1334) at org.apache.spark.sql.execution.datasources.DataSourceStrategy$.apply(DataSourceStrategy.scala:126) at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58) at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58) at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371) at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:59) at org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:48) at org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:46) at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:53) at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:53) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:51) at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation.run(InsertIntoHadoopFsRelation.scala:106) at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58) at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56) at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150) at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130) at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:56) at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:56) at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:256) at org.apache.spark.sql.DataFrameWriter.dataSource$lzycompute$1(DataFrameWriter.scala:181) at org.apache.spark.sql.DataFrameWriter.org$apache$spark$sql$DataFrameWriter$$dataSource$1(DataFrameWriter.scala:181) at org.apache.spark.sql.DataFrameWriter$$anonfun$save$1.apply$mcV$sp(DataFrameWriter.scala:188) at org.apache.spark.sql.DataFrameWriter.executeAndCallQEListener(DataFrameWriter.scala:154) at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:188) at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:172) at org.apache.spark.sql.DataFrameWriter.parquet(DataFrameWriter.scala:370) at SimpleApp$.main(SimpleApp.scala:169) at SimpleApp.main(SimpleApp.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:730) at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.xerial.snappy.Snappy at org.apache.spark.io.SnappyCompressionCodec$.liftedTree1$1(CompressionCodec.scala:169) ... 53 more 19/06/07 12:56:08 INFO spark.SparkContext: Invoking stop() from shutdown hook 19/06/07 12:56:08 INFO ui.SparkUI: Stopped Spark web UI at http://10.188.223.5:4040 19/06/07 12:56:08 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! 19/06/07 12:56:09 INFO storage.MemoryStore: MemoryStore cleared 19/06/07 12:56:09 INFO storage.BlockManager: BlockManager stopped 19/06/07 12:56:09 INFO storage.BlockManagerMaster: BlockManagerMaster stopped 19/06/07 12:56:09 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped! 19/06/07 12:56:09 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon. 19/06/07 12:56:09 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports. 19/06/07 12:56:09 INFO Remoting: Remoting shut down 19/06/07 12:56:09 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down. 19/06/07 12:56:09 INFO spark.SparkContext: Successfully stopped SparkContext 19/06/07 12:56:09 INFO util.ShutdownHookManager: Shutdown hook called 19/06/07 12:56:09 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-111712ef-39a8-41b6-bf6d-5d317d954fa1 spark submit command: spark-submit --class SimpleApp --master local[8] /home/ak_bng/spark_jars/Simple_Project-assembly-1.0.jar. I went through few links to resolve this issue(Snappy Compression not working due to tmp folder previliges, Apache Spark - Parquet / Snappy compression error, ) but none couldn't really provide a solution for this. I had run Spark on HDFS (separate installation) successfully without any errors before. The problem started coming once CDHwas installed. I am new to setting up cluster and quite don't understand what the issue is here and how to resolve it. Can some one help please shed some light on this. I am using: CDH 5.16 Spark 1.6.0 Server OS: RHEL 7 Hadoop 2.6
Spark Standalone on Kubernetes - application got finished after consecutive master then driver failure
Trying to achieve High Availability of SparkMaster using ZooKeeper with SparkDriver resiliency using metaData checkpoint into GlusterFS. Some Informations : Using Spark 2.2.0 (prebuilt binary) Submitting a streaming app with --deploy-mode cluster and --supervise from a separate spark client pod Spark Components on Kubernetes are of type Statefulset for Dynamic Volume Provisioning (Previously using Replication Controller/ Deployment) Created 3 GlusterFS shared pvc - spark-master-pvc,spark-worker-pvc,spark-ckp-pvc Successfully achieved the Scenarios like - Only Master Failure, Only Driver Failure, Consecutive Master and Driver Failure, Driver Failure then Master. But the Scenario like Submitted a Job -> Master Failure (Working fine) -> Driver Failure i.e. Worker Pod failure is not working. NEW ALIVE MASTER's log - 18/06/11 10:23:16 INFO ZooKeeperLeaderElectionAgent: We have gained leadership 18/06/11 10:23:16 INFO Master: I have been elected leader! New state: RECOVERING 18/06/11 10:23:16 INFO Master: Trying to recover app: app-20180611102123-0001 18/06/11 10:23:16 INFO Master: Trying to recover worker: worker-20180611101834-10.1.53.142-36203 18/06/11 10:23:16 INFO Master: Trying to recover worker: worker-20180611102123-10.1.170.85-39447 18/06/11 10:23:16 INFO Master: Trying to recover worker: worker-20180611101834-10.1.185.87-38235 18/06/11 10:23:16 INFO TransportClientFactory: Successfully created connection to /10.1.53.142:36203 after 7 ms (0 ms spent in bootstraps) 18/06/11 10:23:16 INFO TransportClientFactory: Successfully created connection to /10.1.185.87:38235 after 3 ms (0 ms spent in bootstraps) 18/06/11 10:23:16 INFO TransportClientFactory: Successfully created connection to /10.1.53.142:38994 after 12 ms (0 ms spent in bootstraps) 18/06/11 10:23:16 INFO TransportClientFactory: Successfully created connection to /10.1.170.85:39447 after 7 ms (0 ms spent in bootstraps) 18/06/11 10:23:16 INFO Master: Application has been re-registered: app-20180611102123-0001 18/06/11 10:23:16 INFO Master: Worker has been re-registered: worker-20180611102123-10.1.170.85-39447 18/06/11 10:23:16 INFO Master: Worker has been re-registered: worker-20180611101834-10.1.53.142-36203 18/06/11 10:23:16 INFO Master: Worker has been re-registered: worker-20180611101834-10.1.185.87-38235 18/06/11 10:23:16 INFO Master: Recovery complete - resuming operations! 18/06/11 10:24:37 INFO Master: Received unregister request from application app-20180611102123-0001 18/06/11 10:24:37 INFO Master: Removing app app-20180611102123-0001 18/06/11 10:24:37 INFO Master: 10.1.53.142:38994 got disassociated, removing it. 18/06/11 10:24:37 INFO Master: 10.1.53.142:38994 got disassociated, removing it. 18/06/11 10:24:37 WARN Master: Got status update for unknown executor app-20180611102123-0001/0 18/06/11 10:24:37 WARN Master: Got status update for unknown executor app-20180611102123-0001/1 18/06/11 10:24:38 INFO Master: 10.1.53.142:36203 got disassociated, removing it. 18/06/11 10:24:38 INFO Master: Removing worker worker-20180611101834-10.1.53.142-36203 on 10.1.53.142:36203 18/06/11 10:24:38 INFO Master: Re-launching driver-20180611102017-0000 18/06/11 10:24:38 INFO Master: Launching driver driver-20180611102017-0000 on worker worker-20180611101834-10.1.185.87-38235 18/06/11 10:24:38 INFO Master: 10.1.53.142:59142 got disassociated, removing it. 18/06/11 10:24:38 INFO Master: 10.1.53.142:36203 got disassociated, removing it. 18/06/11 10:24:38 INFO Master: 10.1.53.142:36203 got disassociated, removing it. 18/06/11 10:24:43 INFO Master: Registering worker 10.1.53.143:35156 with 8 cores, 30.3 GB RAM DRIVER is remained in Halted State. Driver Error Log - log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 18/06/11 19:32:14 INFO SecurityManager: Changing view acls to: root 18/06/11 19:32:14 INFO SecurityManager: Changing modify acls to: root 18/06/11 19:32:14 INFO SecurityManager: Changing view acls groups to: 18/06/11 19:32:14 INFO SecurityManager: Changing modify acls groups to: 18/06/11 19:32:14 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set() 18/06/11 19:32:15 INFO Utils: Successfully started service 'Driver' on port 40594. 18/06/11 19:32:15 INFO WorkerWatcher: Connecting to worker spark://Worker#10.1.185.87:38235 18/06/11 19:32:15 INFO TransportClientFactory: Successfully created connection to /10.1.185.87:38235 after 44 ms (0 ms spent in bootstraps) 18/06/11 19:32:15 INFO WorkerWatcher: Successfully connected to spark://Worker#10.1.185.87:38235 18/06/11 19:32:15 INFO CheckpointReader: Checkpoint files found: file:/ckp/checkpoint-1528712675000,file:/ckp/checkpoint-1528712675000.bk,file:/ckp/checkpoint-1528712670000,file:/ckp/checkpoint-1528712670000.bk,file:/ckp/checkpoint-1528712665000,file:/ckp/checkpoint-1528712665000.bk,file:/ckp/checkpoint-1528712660000,file:/ckp/checkpoint-1528712660000.bk,file:/ckp/checkpoint-1528712655000,file:/ckp/checkpoint-1528712655000.bk 18/06/11 19:32:15 INFO CheckpointReader: Attempting to load checkpoint from file file:/ckp/checkpoint-1528712675000 18/06/11 19:32:15 INFO Checkpoint: Checkpoint for time 1528712675000 ms validated 18/06/11 19:32:15 INFO CheckpointReader: Checkpoint successfully loaded from file file:/ckp/checkpoint-1528712675000 18/06/11 19:32:15 INFO CheckpointReader: Checkpoint was generated at time 1528712675000 ms 18/06/11 19:32:15 INFO SparkContext: Running Spark version 2.2.0 18/06/11 19:32:15 INFO SparkContext: Submitted application: SparkStreamingWithCheckPointAndZK 18/06/11 19:32:15 INFO SecurityManager: Changing view acls to: root 18/06/11 19:32:15 INFO SecurityManager: Changing modify acls to: root 18/06/11 19:32:15 INFO SecurityManager: Changing view acls groups to: 18/06/11 19:32:15 INFO SecurityManager: Changing modify acls groups to: 18/06/11 19:32:15 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set() 18/06/11 19:32:15 INFO Utils: Successfully started service 'sparkDriver' on port 46544. 18/06/11 19:32:15 INFO SparkEnv: Registering MapOutputTracker 18/06/11 19:32:15 INFO SparkEnv: Registering BlockManagerMaster 18/06/11 19:32:15 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information 18/06/11 19:32:15 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up 18/06/11 19:32:16 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-623c4b9e-8045-4a19-a746-96a3b23c1184 18/06/11 19:32:16 INFO MemoryStore: MemoryStore started with capacity 366.3 MB 18/06/11 19:32:16 INFO SparkEnv: Registering OutputCommitCoordinator 18/06/11 19:32:16 INFO Utils: Successfully started service 'SparkUI' on port 4040. 18/06/11 19:32:16 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.1.185.87:4040 18/06/11 19:32:16 INFO SparkContext: Added JAR file:///opt/spark/jars/spark-0.0.1-SNAPSHOT.jar at spark://10.1.185.87:46544/jars/spark-0.0.1-SNAPSHOT.jar with timestamp 1528745536460 18/06/11 19:32:16 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://10.1.170.81:7077... 18/06/11 19:32:36 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://10.1.170.81:7077... 18/06/11 19:32:56 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://10.1.170.81:7077... 18/06/11 19:33:16 ERROR StandaloneSchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up. 18/06/11 19:33:16 WARN StandaloneSchedulerBackend: Application ID is not initialized yet. 18/06/11 19:33:16 INFO SparkUI: Stopped Spark web UI at http://10.1.185.87:4040 18/06/11 19:33:16 INFO StandaloneSchedulerBackend: Shutting down all executors 18/06/11 19:33:16 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 46323. 18/06/11 19:33:16 INFO NettyBlockTransferService: Server created on 10.1.185.87:46323 18/06/11 19:33:16 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy 18/06/11 19:33:16 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asking each executor to shut down 18/06/11 19:33:16 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.1.185.87, 46323, None) 18/06/11 19:33:16 WARN StandaloneAppClient$ClientEndpoint: Drop UnregisterApplication(null) because has not yet connected to master 18/06/11 19:33:16 INFO BlockManagerMasterEndpoint: Registering block manager 10.1.185.87:46323 with 366.3 MB RAM, BlockManagerId(driver, 10.1.185.87, 46323, None) 18/06/11 19:33:16 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.1.185.87, 46323, None) 18/06/11 19:33:16 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 10.1.185.87, 46323, None) 18/06/11 19:33:16 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! 18/06/11 19:33:16 INFO MemoryStore: MemoryStore cleared 18/06/11 19:33:16 INFO BlockManager: BlockManager stopped 18/06/11 19:33:16 INFO BlockManagerMaster: BlockManagerMaster stopped 18/06/11 19:33:16 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped! 18/06/11 19:33:16 ERROR SparkContext: Error initializing SparkContext. java.lang.IllegalArgumentException: requirement failed: Can only call getServletHandlers on a running MetricsSystem at scala.Predef$.require(Predef.scala:224) at org.apache.spark.metrics.MetricsSystem.getServletHandlers(MetricsSystem.scala:91) at org.apache.spark.SparkContext.<init>(SparkContext.scala:524) at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509) at org.apache.spark.streaming.StreamingContext.<init>(StreamingContext.scala:141) at apache.spark.streaming.StreamingContext$$anonfun$getOrCreate$1.apply(StreamingContext.scala:829) at org.apache.spark.streaming.StreamingContext$$anonfun$getOrCreate$1.apply(StreamingContext.scala:829) at scala.Option.map(Option.scala:146) at org.apache.spark.streaming.StreamingContext$.getOrCreate(StreamingContext.scala:829) at org.apache.spark.streaming.api.java.JavaStreamingContext$.getOrCreate(JavaStreamingContext.scala:626) at org.apache.spark.streaming.api.java.JavaStreamingContext.getOrCreate(JavaStreamingContext.scala) at org.merlin.spark.SparkKafkaStreamingWithGluster.main(SparkKafkaStreamingWithGluster.java:42) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:58) at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala) 18/06/11 19:33:16 INFO SparkContext: SparkContext already stopped. Exception in thread "main" java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at scala.Predef$.require(Predef.scala:224) at org.apache.spark.metrics.MetricsSystem.getServletHandlers(MetricsSystem.scala:91) at org.apache.spark.SparkContext.<init>(SparkContext.scala:524) at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509) at org.apache.spark.streaming.StreamingContext.<init>(StreamingContext.scala:141) at org.apache.spark.streaming.StreamingContext$$anonfun$getOrCreate$1.apply(StreamingContext.scala:829) at org.apache.spark.streaming.StreamingContext$$anonfun$getOrCreate$1.apply(StreamingContext.scala:829) at scala.Option.map(Option.scala:146) at org.apache.spark.streaming.StreamingContext$.getOrCreate(StreamingContext.scala:829) at org.apache.spark.streaming.api.java.JavaStreamingContext$.getOrCreate(JavaStreamingContext.scala:626) at org.apache.spark.streaming.api.java.JavaStreamingContext.getOrCreate(JavaStreamingContext.scala) at org.merlin.spark.SparkKafkaStreamingWithGluster.main(SparkKafkaStreamingWithGluster.java:42) ... 6 more Am I choosing the right resource controller i.e. Statefulsets of kubernetes for spark? M new to this environment, any help will be highly appreciable.
Seems like your driver is not able to find master node. Here is the log 18/06/11 19:33:16 ERROR StandaloneSchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up. Try to telnet ip and port from your client machine.