I have set up DataStax Enterprise on three nodes in the local network.
Two nodes are Debian servers, where I used the apt package manager for the installation. The third node is an iMac, where I used the .dmg package.
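(For reference, the apt route on the Debian-based nodes is roughly the following; this is a sketch that assumes the DataStax apt repository and its credentials are already configured.)
$ sudo apt-get update
$ sudo apt-get install dse-full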
Node #1:
OS: Debian GNU/Linux 8.10 (jessie)
Local IP: 172.16.21.18
DataStax Enterprise: 5.1.7
Node #2:
OS: Ubuntu 16.04.3 LTS
Local IP: 172.16.21.25
DataStax Enterprise: 5.1.7
Node #3:
OS: macOS 10.13.2
Local IP: 192.168.1.108
DataStax Enterprise: 5.1.7
All nodes are up and running in Analytics and Search mode ($ dse cassandra -k -s).
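(To sanity-check the cluster before wiring up Zeppelin, something like the following should confirm the workloads and the master URL; a sketch, both tools ship with DSE.)
$ dsetool status                        # the Workload column should report Analytics/Search per node
$ dse client-tool spark master-address  # prints the master URL the Spark driver should use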
Now I'm trying to connect to the Spark cluster using Apache Zeppelin 0.7.3. Apache Zeppelin is installed and configured on Node #1.
I followed these instructions for the configuration. Below you can see the relevant changes in the config files:
zeppelin-0.7.3-bin-all/conf/zeppelin-env.sh
[..]
export MASTER=spark://172.16.21.18:7077 # Spark master url. eg. spark://master_addr:7077. Leave empty if you want to use local mode.
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export DSE_HOME=/usr
[..]
zeppelin-0.7.3-bin-all/bin/interpreter.sh
[..]
# set spark related env variables
if [[ "${INTERPRETER_ID}" == "spark" ]]; then
if [[ -n "${SPARK_HOME}" ]]; then
export SPARK_SUBMIT="${DSE_HOME}/bin/dse spark-submit"
[..]
Zeppelin Spark interpreter:
The Zeppelin CQL interpreter works perfectly with Apache Cassandra, but when I try to use the Spark interpreter to execute some queries I get this error:
%spark
val results = spark.sql("SELECT * from keyspace.table")
java.lang.NullPointerException
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:38)
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:33)
[..]
Complete Zeppelin log file:
INFO [2018-02-21 04:25:36,185] ({Thread-0} RemoteInterpreterServer.java[run]:97) - Starting remote interpreter server on port 52127
INFO [2018-02-21 04:25:36,562] ({pool-1-thread-3} RemoteInterpreterServer.java[createInterpreter]:198) - Instantiate interpreter org.apache.zeppelin.spark.SparkInterpreter
INFO [2018-02-21 04:25:36,589] ({pool-1-thread-3} RemoteInterpreterServer.java[createInterpreter]:198) - Instantiate interpreter org.apache.zeppelin.spark.SparkSqlInterpreter
INFO [2018-02-21 04:25:36,601] ({pool-1-thread-3} RemoteInterpreterServer.java[createInterpreter]:198) - Instantiate interpreter org.apache.zeppelin.spark.DepInterpreter
INFO [2018-02-21 04:25:36,619] ({pool-1-thread-3} RemoteInterpreterServer.java[createInterpreter]:198) - Instantiate interpreter org.apache.zeppelin.spark.PySparkInterpreter
INFO [2018-02-21 04:25:36,622] ({pool-1-thread-3} RemoteInterpreterServer.java[createInterpreter]:198) - Instantiate interpreter org.apache.zeppelin.spark.SparkRInterpreter
INFO [2018-02-21 04:25:36,683] ({pool-2-thread-2} SchedulerFactory.java[jobStarted]:131) - Job remoteInterpretJob_1519205136682 started by scheduler org.apache.zeppelin.spark.SparkInterpreter269729544
INFO [2018-02-21 04:25:40,733] ({pool-2-thread-2} SparkInterpreter.java[createSparkSession]:318) - ------ Create new SparkContext spark://172.16.21.18:7077 -------
WARN [2018-02-21 04:25:40,740] ({pool-2-thread-2} SparkInterpreter.java[setupConfForSparkR]:577) - sparkr.zip is not found, sparkr may not work.
INFO [2018-02-21 04:25:40,786] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Running Spark version 2.1.0
WARN [2018-02-21 04:25:41,760] ({pool-2-thread-2} NativeCodeLoader.java[<clinit>]:62) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
WARN [2018-02-21 04:25:41,958] ({pool-2-thread-2} Logging.scala[logWarning]:66) -
SPARK_CLASSPATH was detected (set to ':/home/cassandra/zeppelin-0.7.3-bin-all/interpreter/spark/dep/*:/home/cassandra/zeppelin-0.7.3-bin-all/interpreter/spark/*:/home/cassandra/zeppelin-0.7.3-bin-all/lib/interpreter/*:').
This is deprecated in Spark 1.0+.
Please instead use:
- ./spark-submit with --driver-class-path to augment the driver classpath
- spark.executor.extraClassPath to augment the executor classpath
WARN [2018-02-21 04:25:41,959] ({pool-2-thread-2} Logging.scala[logWarning]:66) - Setting 'spark.executor.extraClassPath' to ':/home/cassandra/zeppelin-0.7.3-bin-all/interpreter/spark/dep/*:/home/cassandra/zeppelin-0.7.3-bin-all/interpreter/spark/*:/home/cassandra/zeppelin-0.7.3-bin-all/lib/interpreter/*:' as a work-around.
WARN [2018-02-21 04:25:41,960] ({pool-2-thread-2} Logging.scala[logWarning]:66) - Setting 'spark.driver.extraClassPath' to ':/home/cassandra/zeppelin-0.7.3-bin-all/interpreter/spark/dep/*:/home/cassandra/zeppelin-0.7.3-bin-all/interpreter/spark/*:/home/cassandra/zeppelin-0.7.3-bin-all/lib/interpreter/*:' as a work-around.
WARN [2018-02-21 04:25:41,986] ({pool-2-thread-2} Logging.scala[logWarning]:66) - Your hostname, XPLAIN005 resolves to a loopback address: 127.0.1.1; using 172.16.21.18 instead (on interface eth0)
WARN [2018-02-21 04:25:41,987] ({pool-2-thread-2} Logging.scala[logWarning]:66) - Set SPARK_LOCAL_IP if you need to bind to another address
INFO [2018-02-21 04:25:42,017] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Changing view acls to: cassandra
INFO [2018-02-21 04:25:42,017] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Changing modify acls to: cassandra
INFO [2018-02-21 04:25:42,018] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Changing view acls groups to:
INFO [2018-02-21 04:25:42,019] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Changing modify acls groups to:
INFO [2018-02-21 04:25:42,019] ({pool-2-thread-2} Logging.scala[logInfo]:54) - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(cassandra); groups with view permissions: Set(); users with modify permissions: Set(cassandra); groups with modify permissions: Set()
INFO [2018-02-21 04:25:42,417] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Successfully started service 'sparkDriver' on port 51240.
INFO [2018-02-21 04:25:42,445] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Registering MapOutputTracker
INFO [2018-02-21 04:25:42,476] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Registering BlockManagerMaster
INFO [2018-02-21 04:25:42,481] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
INFO [2018-02-21 04:25:42,482] ({pool-2-thread-2} Logging.scala[logInfo]:54) - BlockManagerMasterEndpoint up
INFO [2018-02-21 04:25:42,507] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Created local directory at /tmp/blockmgr-797ea400-69f1-4228-a6da-fe424edce8d4
INFO [2018-02-21 04:25:42,524] ({pool-2-thread-2} Logging.scala[logInfo]:54) - MemoryStore started with capacity 408.9 MB
INFO [2018-02-21 04:25:42,591] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Registering OutputCommitCoordinator
INFO [2018-02-21 04:25:42,700] ({pool-2-thread-2} Log.java[initialized]:186) - Logging initialized #6930ms
INFO [2018-02-21 04:25:42,864] ({pool-2-thread-2} Server.java[doStart]:327) - jetty-9.2.z-SNAPSHOT
INFO [2018-02-21 04:25:42,902] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#2cbd702d{/jobs,null,AVAILABLE}
INFO [2018-02-21 04:25:42,903] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#240b993c{/jobs/json,null,AVAILABLE}
INFO [2018-02-21 04:25:42,903] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#5b7d8292{/jobs/job,null,AVAILABLE}
INFO [2018-02-21 04:25:42,908] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#4c2353ff{/jobs/job/json,null,AVAILABLE}
INFO [2018-02-21 04:25:42,909] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#bd87e4e{/stages,null,AVAILABLE}
INFO [2018-02-21 04:25:42,910] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#73e2d470{/stages/json,null,AVAILABLE}
INFO [2018-02-21 04:25:42,917] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#44bca18c{/stages/stage,null,AVAILABLE}
INFO [2018-02-21 04:25:42,918] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#1256be4f{/stages/stage/json,null,AVAILABLE}
INFO [2018-02-21 04:25:42,919] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#5a349845{/stages/pool,null,AVAILABLE}
INFO [2018-02-21 04:25:42,919] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#3f108627{/stages/pool/json,null,AVAILABLE}
INFO [2018-02-21 04:25:42,926] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#1e01f088{/storage,null,AVAILABLE}
INFO [2018-02-21 04:25:42,927] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#390281c1{/storage/json,null,AVAILABLE}
INFO [2018-02-21 04:25:42,927] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#470ac014{/storage/rdd,null,AVAILABLE}
INFO [2018-02-21 04:25:42,927] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#7c90476c{/storage/rdd/json,null,AVAILABLE}
INFO [2018-02-21 04:25:42,928] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#6d847dc6{/environment,null,AVAILABLE}
INFO [2018-02-21 04:25:42,936] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#40a5e53e{/environment/json,null,AVAILABLE}
INFO [2018-02-21 04:25:42,937] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#513e975e{/executors,null,AVAILABLE}
INFO [2018-02-21 04:25:42,937] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#2f6b1132{/executors/json,null,AVAILABLE}
INFO [2018-02-21 04:25:42,938] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#61cf2354{/executors/threadDump,null,AVAILABLE}
INFO [2018-02-21 04:25:42,939] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#eacb646{/executors/threadDump/json,null,AVAILABLE}
INFO [2018-02-21 04:25:42,951] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#2b8d44aa{/static,null,AVAILABLE}
INFO [2018-02-21 04:25:42,953] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#5c982268{/,null,AVAILABLE}
INFO [2018-02-21 04:25:42,954] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#44556f2c{/api,null,AVAILABLE}
INFO [2018-02-21 04:25:42,955] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#2fa0ef66{/jobs/job/kill,null,AVAILABLE}
INFO [2018-02-21 04:25:42,955] ({pool-2-thread-2} ContextHandler.java[doStart]:744) - Started o.s.j.s.ServletContextHandler#6e49562c{/stages/stage/kill,null,AVAILABLE}
INFO [2018-02-21 04:25:42,970] ({pool-2-thread-2} AbstractConnector.java[doStart]:266) - Started ServerConnector#53405611{HTTP/1.1}{0.0.0.0:4040}
INFO [2018-02-21 04:25:42,971] ({pool-2-thread-2} Server.java[doStart]:379) - Started #7201ms
INFO [2018-02-21 04:25:42,971] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Successfully started service 'SparkUI' on port 4040.
INFO [2018-02-21 04:25:42,974] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Bound SparkUI to 0.0.0.0, and started at http://172.16.21.18:4040
INFO [2018-02-21 04:25:43,214] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Added file file:/home/cassandra/zeppelin-0.7.3-bin-all/interpreter/spark/pyspark/pyspark.zip at spark://172.16.21.18:51240/files/pyspark.zip with timestamp 1519205143214
INFO [2018-02-21 04:25:43,217] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Copying /home/cassandra/zeppelin-0.7.3-bin-all/interpreter/spark/pyspark/pyspark.zip to /tmp/spark-2e9292e3-8c4d-445a-92f0-7d54188818db/userFiles-4e8301a5-91bc-4753-8436-6cced0bdc5c5/pyspark.zip
INFO [2018-02-21 04:25:43,226] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Added file file:/home/cassandra/zeppelin-0.7.3-bin-all/interpreter/spark/pyspark/py4j-0.10.4-src.zip at spark://172.16.21.18:51240/files/py4j-0.10.4-src.zip with timestamp 1519205143226
INFO [2018-02-21 04:25:43,227] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Copying /home/cassandra/zeppelin-0.7.3-bin-all/interpreter/spark/pyspark/py4j-0.10.4-src.zip to /tmp/spark-2e9292e3-8c4d-445a-92f0-7d54188818db/userFiles-4e8301a5-91bc-4753-8436-6cced0bdc5c5/py4j-0.10.4-src.zip
INFO [2018-02-21 04:25:43,279] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Created default pool default, schedulingMode: FIFO, minShare: 0, weight: 1
INFO [2018-02-21 04:25:43,325] ({appclient-register-master-threadpool-0} Logging.scala[logInfo]:54) - Connecting to master spark://172.16.21.18:7077...
INFO [2018-02-21 04:25:43,391] ({netty-rpc-connection-0} TransportClientFactory.java[createClient]:250) - Successfully created connection to /172.16.21.18:7077 after 33 ms (0 ms spent in bootstraps)
INFO [2018-02-21 04:26:03,326] ({appclient-register-master-threadpool-0} Logging.scala[logInfo]:54) - Connecting to master spark://172.16.21.18:7077...
INFO [2018-02-21 04:26:23,326] ({appclient-register-master-threadpool-0} Logging.scala[logInfo]:54) - Connecting to master spark://172.16.21.18:7077...
ERROR [2018-02-21 04:26:43,328] ({appclient-registration-retry-thread} Logging.scala[logError]:70) - Application has been killed. Reason: All masters are unresponsive! Giving up.
WARN [2018-02-21 04:26:43,328] ({pool-2-thread-2} Logging.scala[logWarning]:66) - Application ID is not initialized yet.
INFO [2018-02-21 04:26:43,336] ({stop-spark-context} AbstractConnector.java[doStop]:306) - Stopped ServerConnector#53405611{HTTP/1.1}{0.0.0.0:4040}
INFO [2018-02-21 04:26:43,339] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 40068.
INFO [2018-02-21 04:26:43,498] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Server created on 172.16.21.18:40068
INFO [2018-02-21 04:26:43,499] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#6e49562c{/stages/stage/kill,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,500] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#2fa0ef66{/jobs/job/kill,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,501] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#44556f2c{/api,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,501] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#5c982268{/,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,505] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#2b8d44aa{/static,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,506] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#eacb646{/executors/threadDump/json,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,507] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#61cf2354{/executors/threadDump,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,508] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
INFO [2018-02-21 04:26:43,508] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#2f6b1132{/executors/json,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,509] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#513e975e{/executors,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,510] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#40a5e53e{/environment/json,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,511] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#6d847dc6{/environment,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,511] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#7c90476c{/storage/rdd/json,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,512] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#470ac014{/storage/rdd,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,513] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#390281c1{/storage/json,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,513] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#1e01f088{/storage,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,513] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#3f108627{/stages/pool/json,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,514] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Registering BlockManager BlockManagerId(driver, 172.16.21.18, 40068, None)
INFO [2018-02-21 04:26:43,514] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#5a349845{/stages/pool,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,515] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#1256be4f{/stages/stage/json,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,515] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#44bca18c{/stages/stage,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,516] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#73e2d470{/stages/json,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,516] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#bd87e4e{/stages,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,517] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#4c2353ff{/jobs/job/json,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,517] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#5b7d8292{/jobs/job,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,518] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#240b993c{/jobs/json,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,518] ({stop-spark-context} ContextHandler.java[doStop]:865) - Stopped o.s.j.s.ServletContextHandler#2cbd702d{/jobs,null,UNAVAILABLE}
INFO [2018-02-21 04:26:43,521] ({dispatcher-event-loop-0} Logging.scala[logInfo]:54) - Registering block manager 172.16.21.18:40068 with 408.9 MB RAM, BlockManagerId(driver, 172.16.21.18, 40068, None)
INFO [2018-02-21 04:26:43,522] ({stop-spark-context} Logging.scala[logInfo]:54) - Stopped Spark web UI at http://172.16.21.18:4040
INFO [2018-02-21 04:26:43,526] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Registered BlockManager BlockManagerId(driver, 172.16.21.18, 40068, None)
INFO [2018-02-21 04:26:43,527] ({pool-2-thread-2} Logging.scala[logInfo]:54) - Initialized BlockManager: BlockManagerId(driver, 172.16.21.18, 40068, None)
INFO [2018-02-21 04:26:43,530] ({stop-spark-context} Logging.scala[logInfo]:54) - Shutting down all executors
INFO [2018-02-21 04:26:43,546] ({dispatcher-event-loop-1} Logging.scala[logInfo]:54) - Asking each executor to shut down
WARN [2018-02-21 04:26:43,561] ({dispatcher-event-loop-0} Logging.scala[logWarning]:66) - Drop UnregisterApplication(null) because has not yet connected to master
INFO [2018-02-21 04:26:43,583] ({dispatcher-event-loop-2} Logging.scala[logInfo]:54) - MapOutputTrackerMasterEndpoint stopped!
INFO [2018-02-21 04:26:43,596] ({stop-spark-context} Logging.scala[logInfo]:54) - MemoryStore cleared
INFO [2018-02-21 04:26:43,597] ({stop-spark-context} Logging.scala[logInfo]:54) - BlockManager stopped
INFO [2018-02-21 04:26:43,605] ({stop-spark-context} Logging.scala[logInfo]:54) - BlockManagerMaster stopped
INFO [2018-02-21 04:26:43,608] ({dispatcher-event-loop-1} Logging.scala[logInfo]:54) - OutputCommitCoordinator stopped!
ERROR [2018-02-21 04:26:43,748] ({pool-2-thread-2} Logging.scala[logError]:91) - Error initializing SparkContext.
java.lang.IllegalArgumentException: requirement failed: Can only call getServletHandlers on a running MetricsSystem
at scala.Predef$.require(Predef.scala:224)
at org.apache.spark.metrics.MetricsSystem.getServletHandlers(MetricsSystem.scala:91)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:524)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:38)
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:33)
at org.apache.zeppelin.spark.SparkInterpreter.createSparkSession(SparkInterpreter.java:378)
at org.apache.zeppelin.spark.SparkInterpreter.getSparkSession(SparkInterpreter.java:233)
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:841)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:491)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
INFO [2018-02-21 04:26:43,751] ({pool-2-thread-2} Logging.scala[logInfo]:54) - SparkContext already stopped.
ERROR [2018-02-21 04:26:43,751] ({pool-2-thread-2} Utils.java[invokeMethod]:40) -
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:38)
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:33)
at org.apache.zeppelin.spark.SparkInterpreter.createSparkSession(SparkInterpreter.java:378)
at org.apache.zeppelin.spark.SparkInterpreter.getSparkSession(SparkInterpreter.java:233)
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:841)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:491)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: requirement failed: Can only call getServletHandlers on a running MetricsSystem
at scala.Predef$.require(Predef.scala:224)
at org.apache.spark.metrics.MetricsSystem.getServletHandlers(MetricsSystem.scala:91)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:524)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
... 20 more
INFO [2018-02-21 04:26:43,752] ({stop-spark-context} Logging.scala[logInfo]:54) - Successfully stopped SparkContext
INFO [2018-02-21 04:26:43,752] ({pool-2-thread-2} SparkInterpreter.java[createSparkSession]:379) - Created Spark session
ERROR [2018-02-21 04:26:43,753] ({pool-2-thread-2} Job.java[run]:181) - Job failed
java.lang.NullPointerException
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:38)
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:33)
at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext_2(SparkInterpreter.java:398)
at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:387)
at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:146)
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:843)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:491)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
INFO [2018-02-21 04:26:43,759] ({pool-2-thread-2} SchedulerFactory.java[jobFinished]:137) - Job remoteInterpretJob_1519205136682 finished by scheduler org.apache.zeppelin.spark.SparkInterpreter269729544
What do you think?
UPDATE:
All nodes upgraded to DataStax Enterprise 5.1.7.
With DSE 5.1, any reference to the Spark master should look like this example:
export MASTER=dse://1.20.300.10
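(So the zeppelin-env.sh entry shown earlier would become something like the following sketch, assuming the driver should reach the cluster through Node #1.)
export MASTER=dse://172.16.21.18  # dse:// scheme instead of spark://host:7077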
ERROR [2018-02-21 04:26:43,328] ({appclient-registration-retry-thread} Logging.scala[logError]:70) - Application has been killed. Reason: All masters are unresponsive! Giving up.
WARN [2018-02-21 04:26:43,328] ({pool-2-thread-2} Logging.scala[logWarning]:66) - Application ID is not initialized yet.
It seems the app was killed. Could you check the logs on the Spark master?
16/04/26 16:58:46 DEBUG ProtobufRpcEngine: Call: complete took 3ms
Exception in thread "main" java.lang.NoClassDefFoundError: com/datastax/spark/connector/japi/CassandraJavaUtil
at com.baitic.mcava.lecturahdfssaveincassandra.TratamientoCSV.main(TratamientoCSV.java:123)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: com.datastax.spark.connector.japi.CassandraJavaUtil
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 10 more
16/04/26 16:58:46 INFO SparkContext: Invoking stop() from shutdown hook
16/04/26 16:58:46 INFO SparkUI: Stopped Spark web UI at http://10.128.0.5:4040
16/04/26 16:58:46 INFO SparkDeploySchedulerBackend: Shutting down all executors
16/04/26 16:58:46 INFO SparkDeploySchedulerBackend: Asking each executor to shut down
16/04/26 16:58:46 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/04/26 16:58:46 INFO MemoryStore: MemoryStore cleared
16/04/26 16:58:46 INFO BlockManager: BlockManager stopped
16/04/26 16:58:46 INFO BlockManagerMaster: BlockManagerMaster stopped
16/04/26 16:58:46 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/04/26 16:58:46 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/04/26 16:58:46 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/04/26 16:58:46 INFO SparkContext: Successfully stopped SparkContext
16/04/26 16:58:46 INFO ShutdownHookManager: Shutdown hook called
16/04/26 16:58:46 INFO ShutdownHookManager: Deleting directory /srv/spark/tmp/spark-2bf57fa2-a2d5-4f8a-980c-994e56b61c44
16/04/26 16:58:46 DEBUG Client: stopping client from cache: org.apache.hadoop.ipc.Client#3fb9a67f
16/04/26 16:58:46 DEBUG Client: removing client from cache: org.apache.hadoop.ipc.Client#3fb9a67f
16/04/26 16:58:46 DEBUG Client: stopping actual client because no more references remain: org.apache.hadoop.ipc.Client#3fb9a67f
16/04/26 16:58:46 DEBUG Client: Stopping client
16/04/26 16:58:46 DEBUG Client: IPC Client (2107841088) connection to mcava-master/10.128.0.5:54310 from baiticpruebas2: closed
16/04/26 16:58:46 DEBUG Client: IPC Client (2107841088) connection to mcava-master/10.128.0.5:54310 from baiticpruebas2: stopped, remaining connections 0
16/04/26 16:58:46 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
I wrote this simple code:
// Path in HDFS to the input data
String pathDatos = "hdfs://mcava-master:54310/srv/hadoop/data/spark/DatosApp/medidasSensorTratadas.txt";
// Jars in HDFS that should be shipped with the job
String jarPath = "hdfs://mcava-master:54310/srv/hadoop/data/spark/original-LecturaHDFSsaveInCassandra-1.0-SNAPSHOT.jar";
String jar  = "hdfs://mcava-master:54310/srv/hadoop/data/spark/spark-cassandra-connector-assembly-1.6.0-M1-4-g6f01cfe.jar";
String jar2 = "hdfs://mcava-master:54310/srv/hadoop/data/spark/spark-cassandra-connector-java-assembly-1.6.0-M1-4-g6f01cfe.jar";
String[] jars = { jarPath, jar2, jar };

// Note: setJars() distributes these jars to the executors, but it does not put
// them on the driver's own classpath, which is where the NoClassDefFoundError is thrown.
SparkConf conf = new SparkConf()
        .setAppName("TratamientoCSV")
        .setJars(jars);
conf.set("spark.cassandra.connection.host", "10.128.0.5");
conf.set("spark.kryoserializer.buffer.max", "512");
conf.set("spark.kryoserializer.buffer", "256");

JavaSparkContext sc = new JavaSparkContext(conf);
JavaRDD<String> input = sc.textFile(pathDatos);
I also put the path to the Cassandra connector jar in spark-defaults.conf:
spark.driver.extraClassPath hdfs://mcava-master:54310/srv/hadoop/data/spark/spark-cassandra-connector-java-assembly-1.6.0-M1-4-g6f01cfe.jar
spark.executor.extraClassPath hdfs://mcava-master:54310/srv/hadoop/data/spark/spark-cassandra-connector-java-assembly-1.6.0-M1-4-g6f01cfe.jar
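(One assumption about the cause: spark.driver.extraClassPath and spark.executor.extraClassPath are plain JVM classpath entries, so hdfs:// URLs are not downloaded for you; the jar generally has to exist at a local path on every node. A sketch, where /srv/spark/jars/ is a placeholder location:)
spark.driver.extraClassPath   /srv/spark/jars/spark-cassandra-connector-java-assembly-1.6.0-M1-4-g6f01cfe.jar
spark.executor.extraClassPath /srv/spark/jars/spark-cassandra-connector-java-assembly-1.6.0-M1-4-g6f01cfe.jar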
I also passed the --jars flag with the path to the driver jar, but I always get the same error, and I don't understand why.
I'm running on Google Compute Engine.
Try adding the package when you submit your app:
$SPARK_HOME/bin/spark-submit --packages datastax:spark-cassandra-connector:1.6.0-M2-s_2.11 ....
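(Spelled out, the submit command might look like the following sketch. The main class is taken from the stack trace above, the master URL and the local jar path are assumptions, and the _s_2.10/_s_2.11 suffix must match the Scala build of your Spark cluster; stock Spark 1.6 is built against Scala 2.10.)
$SPARK_HOME/bin/spark-submit \
  --class com.baitic.mcava.lecturahdfssaveincassandra.TratamientoCSV \
  --master spark://mcava-master:7077 \
  --packages datastax:spark-cassandra-connector:1.6.0-M2-s_2.10 \
  /path/to/original-LecturaHDFSsaveInCassandra-1.0-SNAPSHOT.jar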
I added this argument to solve the problem: --packages datastax:spark-cassandra-connector:1.6.0-M2-s_2.10.
At least for the 3.0+ Spark Cassandra Connector, the official assembly jar works well for me. It has all the necessary dependencies.
I solved the problem: I made a fat jar with all the dependencies, and then it is not necessary to reference the Cassandra connector at all, only the fat jar.
I used Spark in my Java program and had the same issue.
The problem was that I hadn't included spark-cassandra-connector in my project's Maven dependencies:
<dependency>
    <groupId>com.datastax.spark</groupId>
    <artifactId>spark-cassandra-connector_2.11</artifactId>
    <version>2.0.7</version> <!-- Check the actual version in the Maven repo -->
</dependency>
After that I built a fat jar with all my dependencies, and it worked!
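(For reference, such a fat jar can be produced with the maven-shade-plugin; a minimal sketch, where the plugin version is an assumption:)
<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>3.2.4</version>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals><goal>shade</goal></goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>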
Maybe it will help someone.