dsetool status
DC: dc1 Workload: Cassandra Graph: no
======================================================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Owns VNodes Rack Health [0,1]
UN 192.168.1.130 810.47 MiB ? 256 2a 0.90
UN 192.168.1.131 683.53 MiB ? 256 2a 0.90
UN 192.168.1.132 821.33 MiB ? 256 2a 0.90
DC: dc2 Workload: Analytics Graph: no Analytics Master: 192.168.2.131
=========================================================================================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Owns VNodes Rack Health [0,1]
UN 192.168.2.130 667.05 MiB ? 256 2a 0.90
UN 192.168.2.131 845.48 MiB ? 256 2a 0.90
UN 192.168.2.132 887.92 MiB ? 256 2a 0.90
When I try to launch the spark-submit job
dse -u user -p password spark-submit --class com.sparkLauncher test.jar prf
I get the following error (edited):
ERROR 2017-09-14 20:14:14,174 org.apache.spark.deploy.rm.DseAppClient$ClientEndpoint: Failed to connect to DSE resource manager
java.io.IOException: Failed to register with master: dse://?
....
Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: The method DseResourceManager.registerApplication does not exist. Make sure that the required component for that method is active/enabled
....
ERROR 2017-09-14 20:14:14,179 org.apache.spark.deploy.rm.DseSchedulerBackend: Application has been killed. Reason: Failed to connect to DSE resource manager: Failed to register with master: dse://?
org.apache.spark.SparkException: Exiting due to error from cluster scheduler: Failed to connect to DSE resource manager: Failed to register with master: dse://?
....
WARN 2017-09-14 20:14:14,179 org.apache.spark.deploy.rm.DseSchedulerBackend: Application ID is not initialized yet.
ERROR 2017-09-14 20:14:14,384 org.apache.spark.SparkContext: Error initializing SparkContext.
java.lang.IllegalArgumentException: requirement failed: Can only call getServletHandlers on a running MetricsSystem
ERROR 2017-09-14 20:14:14,387 org.apache.spark.deploy.DseSparkSubmitBootstrapper: Failed to start or submit Spark application
java.lang.IllegalArgumentException: requirement failed: Can only call getServletHandlers on a running MetricsSystem
I can confirm that I have granted the privileges described in this documentation: https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/security/secAuthSpark.html
I am trying this on AWS, if that makes a difference, and I can confirm that the routes between the nodes are all open.
I am able to start the Spark shell from any of the Spark nodes, can bring up the Spark UI, and can get the Spark master address from cqlsh commands.
Any pointers will be helpful, thanks in advance!
The master address must point to one or more nodes in a valid Analytics-enabled datacenter.
Caused by: com.datastax.driver.core.exceptions.InvalidQueryException:
The method DseResourceManager.registerApplication does not exist.
Make sure that the required component for that method is active/enabled
This indicates that the connected node was not Analytics-enabled.
If you submit from a non-Analytics node, you must still point at one of the Analytics nodes in the master URI:
dse://[Spark node address[:port number]]?[parameter name=parameter value;]...
By default, the dse://? URL connects to localhost for its initial cluster connection.
See the documentation for more information.
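For illustration, a minimal sketch of pointing the submission at one of the Analytics nodes in dc2 (address taken from the dsetool output above; depending on your DSE version you may prefer setting spark.master in spark-defaults.conf instead of passing --master):

# hedged sketch: point the master URL at an Analytics node instead of the default localhost
dse -u user -p password spark-submit --master "dse://192.168.2.130?" --class com.sparkLauncher test.jar prf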
For a reason I am unable to pinpoint, I can run it as described above in cluster mode, but not in client mode.
Related
There is a 3-node Cassandra cluster running and serving production traffic. In cassandra.yaml, "endpoint_snitch: GossipingPropertyFileSnitch" is configured, but somehow we forgot to remove the cassandra-topology.properties file from the Cassandra conf directory. As per the Cassandra documentation, if you are using GossipingPropertyFileSnitch you should remove the cassandra-topology.properties file.
Now, as all three nodes are running and serving production traffic, can I remove this file from all three nodes, or do I have to remove it after shutting down the nodes one by one?
The Apache Cassandra version is "3.11.2".
./bin/nodetool status
Datacenter: dc1
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN x.x.x.x1 409.39 GiB 256 62.9% cshdkd-6065-4813-ae53-sdh89hs98so RAC1
UN x.x.x.x2 546.33 GiB 256 67.8% jfdsdk-f18f-4d46-af95-33jw9yhfcsd RAC2
UN x.x.x.x3 594.73 GiB 256 69.3% 7s9skk-a27f-4875-a410-sdsiudw9eww RAC3
If the cluster has already been migrated to GossipingPropertyFileSnitch, then you can safely remove that file without stopping the cluster nodes. See item 7 in the DSE 5.1 documentation (compatible with Cassandra 3.11).
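For illustration, a minimal sketch of that cleanup on each node; the conf path is an assumption (package installs commonly use /etc/cassandra, tarball installs use conf/ under the install directory):

# hedged sketch: remove the leftover legacy snitch file on each running node
sudo rm /etc/cassandra/cassandra-topology.properties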
Cannot start Cassandra in DataStax Enterprise (single-node scenario)
Running $ sudo service dse start I get:
Starting DSE daemon : dse
DSE daemon starting with just Cassandra enabled (edit /etc/default/dse to enable)
running $ nodetool status
Failed to connect to '127.0.0.1:7199': Connection refused (Connection refused)
from output.log:
ERROR 08:03:24,900 Exception encountered during startup
java.lang.IllegalStateException: Unknown commitlog version 6
at org.apache.cassandra.db.commitlog.CommitLogDescriptor.getMessagingVersion(CommitLogDescriptor.java:80)
at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:182)
at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:95)
at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:151)
at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:131)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:336)
at com.datastax.bdp.server.DseDaemon.setup(DseDaemon.java:394)
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
at com.datastax.bdp.server.DseDaemon.main(DseDaemon.java:574)
java.lang.IllegalStateException: Unknown commitlog version 6
at org.apache.cassandra.db.commitlog.CommitLogDescriptor.getMessagingVersion(CommitLogDescriptor.java:80)
at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:182)
at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:95)
at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:151)
at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:131)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:336)
at com.datastax.bdp.server.DseDaemon.setup(DseDaemon.java:394)
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
at com.datastax.bdp.server.DseDaemon.main(DseDaemon.java:574)
Exception encountered during startup: Unknown commitlog version 6
INFO 08:03:24,915 DSE shutting down...
What can I do? Any suggestions are welcome.
I am trying to bring up DataStax Cassandra in Analytics mode using "dse cassandra -k -s". I am using the DSE 5.0 sandbox on a single-node setup.
I have configured spark-env.sh with SPARK_MASTER_IP as well as SPARK_LOCAL_IP pointing to my LAN IP.
export SPARK_LOCAL_IP="172.40.9.79"
export SPARK_MASTER_HOST="172.40.9.79"
export SPARK_WORKER_HOST="172.40.9.79"
export SPARK_MASTER_IP="172.40.9.79"
All of the above variables are set in spark-env.sh.
Despite this, the worker will not come up; it is always looking for a master at 127.0.0.1. This is the error I am seeing in /var/log/cassandra/system.log:
WARN [worker-register-master-threadpool-8] 2016-10-04 08:02:45,832 SPARK-WORKER Logging.scala:91 - Failed to connect to master 127.0.0.1:7077
java.io.IOException: Failed to connect to /127.0.0.1:7077
Result from dse client-tool shows 127.0.0.1
$ dse client-tool -u cassandra -p cassandra spark master-address
spark://127.0.0.1:7077
However, I am able to access the Spark web UI from the LAN IP 172.40.9.79.
Spark Web UI screenshot
Any help is greatly appreciated.
Try adding this parameter in the spark-defaults.conf file: spark.master local[*] or spark.master spark://172.40.9.79:7077. Maybe this solves your problem.
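As a minimal sketch (the spark-defaults.conf location depends on the DSE install, for example under the DSE Spark conf directory), the standalone-master form of that setting would look like:

# hedged sketch: set the standalone master explicitly in spark-defaults.conf
spark.master    spark://172.40.9.79:7077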
I am getting an error when launching the standalone Spark driver in cluster mode. The documentation notes that cluster mode is supported as of the Spark 1.2.1 release, but it is currently not working properly for me. Please help me fix the issue(s) that are preventing Spark from functioning properly.
I have a 3-node Spark cluster: node1, node2, and node3.
I am running the below command on node1 to deploy the driver:
/usr/local/spark-1.2.1-bin-hadoop2.4/bin/spark-submit --class com.fst.firststep.aggregator.FirstStepMessageProcessor --master spark://ec2-xx-xx-xx-xx.compute-1.amazonaws.com:7077 --deploy-mode cluster --supervise file:///home/xyz/sparkstreaming-0.0.1-SNAPSHOT.jar /home/xyz/config.properties
The driver gets launched on node2 in the cluster, but I am getting an exception on node2 that it is trying to bind to node1's IP:
2015-02-26 08:47:32 DEBUG AkkaUtils:63 - In createActorSystem, requireCookie is: off
2015-02-26 08:47:32 INFO Slf4jLogger:80 - Slf4jLogger started
2015-02-26 08:47:33 ERROR NettyTransport:65 - failed to bind to ec2-xx.xx.xx.xx.compute-1.amazonaws.com/xx.xx.xx.xx:0, shutting down Netty transport
2015-02-26 08:47:33 WARN Utils:71 - Service 'Driver' could not bind on port 0. Attempting port 1.
2015-02-26 08:47:33 DEBUG AkkaUtils:63 - In createActorSystem, requireCookie is: off
2015-02-26 08:47:33 ERROR Remoting:65 - Remoting error: [Startup failed] [
akka.remote.RemoteTransportException: Startup failed
at akka.remote.Remoting.akka$remote$Remoting$$notifyError(Remoting.scala:136)
at akka.remote.Remoting.start(Remoting.scala:201)
at akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:184)
at akka.actor.ActorSystemImpl.liftedTree2$1(ActorSystem.scala:618)
at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:615)
at akka.actor.ActorSystemImpl._start(ActorSystem.scala:615)
at akka.actor.ActorSystemImpl.start(ActorSystem.scala:632)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:141)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:118)
at org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:121)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:54)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:53)
at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1765)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1756)
at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:56)
at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:33)
at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
Caused by: org.jboss.netty.channel.ChannelException: Failed to bind to: ec2-xx-xx-xx.compute-1.amazonaws.com/xx.xx.xx.xx:0
at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:393)
at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:389)
at scala.util.Success$$anonfun$map$1.apply(Try.scala:206)
at scala.util.Try$.apply(Try.scala:161)
at scala.util.Success.map(Try.scala:206)
Kindly suggest.
Thanks
It is not possible to bind to port 0. There are errors in your Spark configuration. Specifically, look at spark.webui.port; it is probably set to 0.
After spending a lot more time, I got the answer. I made the below changes:
Removed the entries for SPARK_LOCAL_IP and SPARK_MASTER_IP.
Added the hostname and private IP address of each of the other nodes in /etc/hosts (see the sketch below).
Used --deploy-mode cluster --supervise.
That's all, and it works perfectly with fully HA components (master, slaves, and driver).
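For illustration only, a hypothetical /etc/hosts layout on each node; the hostnames and private IPs below are placeholders, not values from the question:

# hedged sketch: map every node's hostname to its private IP on all three nodes
10.0.0.1   node1
10.0.0.2   node2
10.0.0.3   node3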
Thanks
Cluster mode is not supported on EC2 instances in Spark 1.2, where it creates a standalone cluster. Hence you can try removing
--deploy-mode cluster --supervise
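For reference, the same submit command from the question with those two flags dropped (so it runs in the default client deploy mode) would look roughly like:

# hedged sketch: the original command minus --deploy-mode cluster --supervise
/usr/local/spark-1.2.1-bin-hadoop2.4/bin/spark-submit --class com.fst.firststep.aggregator.FirstStepMessageProcessor --master spark://ec2-xx-xx-xx-xx.compute-1.amazonaws.com:7077 file:///home/xyz/sparkstreaming-0.0.1-SNAPSHOT.jar /home/xyz/config.properties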