We want to deploy a 3-node DSE cluster in which every node is an Analytics node running Spark.
We want to use vnodes in Cassandra because they give a much more even data distribution and make adding nodes easier. We deploy DSE on AWS using one of the available AMIs.
Because DSE by default deploys the Cassandra cluster with single-token nodes, we have to change the cassandra.yaml file manually on every node.
According to the DataStax documentation, I should:
uncomment num_tokens field (I left 256 default value)
leave initial_token field unassigned
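For illustration, the two cassandra.yaml changes listed above can be sketched as a small Python helper that rewrites the file contents as text. The function name `enable_vnodes` and the sample input are hypothetical, and the sketch assumes the file contains a commented-out `# num_tokens: 256` line, as the stock file does.

```python
import re


def enable_vnodes(yaml_text: str, num_tokens: int = 256) -> str:
    """Return cassandra.yaml text with num_tokens set and initial_token commented out.

    Hypothetical helper: it works on plain text, line by line, so comments
    and ordering in the rest of the file are preserved.
    """
    out = []
    num_tokens_written = False
    for line in yaml_text.splitlines():
        if re.match(r"^\s*#?\s*num_tokens\s*:", line):
            # Uncomment (or overwrite) the first num_tokens line, drop repeats.
            if not num_tokens_written:
                out.append(f"num_tokens: {num_tokens}")
                num_tokens_written = True
        elif re.match(r"^\s*initial_token\s*:", line):
            # Leave initial_token unassigned by commenting it out.
            out.append("# " + line.strip())
        else:
            out.append(line)
    if not num_tokens_written:
        out.append(f"num_tokens: {num_tokens}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "cluster_name: 'Test Cluster'\n# num_tokens: 256\ninitial_token: 0\n"
    print(enable_vnodes(sample))
```

This only covers the file edit. The ConfigurationException quoted further down ("Cannot change the number of tokens from 1 to 256") suggests that the edit alone may not be enough on a node that has already started with a single token.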
After that, when I run nodetool status, I see that my cluster still uses single-token mode.
According to this, I should restart the nodes in the cluster so that the changes take effect.
But after the nodes are restarted, whether through OpsCenter or the AWS console, I get errors: the nodes are unresponsive and I cannot run nodetool on them. The error is:
Failed to connect to '127.0.0.1:7199' - ConnectException: 'Connection refused'.
Is there something that I am doing wrong?
How do I enable vnodes on DSE when it is deployed from an AMI?
Thank you
Previously we had a three-node cluster: two Cassandra nodes in one DC and one Spark-enabled node in a different DC.
Spark was running smoothly in that configuration.
Then we tried adding another Spark-enabled node to the analytics DC. We configured GossipingPropertyFileSnitch and added the seeds.
But now when we start the cluster, each of the two nodes is assigned its own Spark master, so a Spark job still runs on a single node. What configuration are we missing to run a Spark job across the cluster?
Most probably you didn't adjust the replication of the Analytics keyspaces, or didn't run a repair after you added the node. Please refer to the instructions in the official documentation.
Also, please check that you configured the same DC for both Analytics nodes, because the Spark master is elected per DC.
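As a rough sketch of what "adjusting the replication" looks like, the helper below builds the CQL statement for one keyspace. The keyspace name `example_ks` and the replication factor of 2 are placeholders, not values from this thread; the official documentation lists which keyspaces actually need the change. The DC name must match the one the snitch reports for the Analytics nodes.

```python
def alter_replication_cql(keyspace: str, dc_replicas: dict) -> str:
    """Build an ALTER KEYSPACE statement using NetworkTopologyStrategy.

    keyspace    -- keyspace name (placeholder in the example below)
    dc_replicas -- mapping of datacenter name to replication factor
    """
    options = ["'class': 'NetworkTopologyStrategy'"]
    # Sort the datacenters so the generated statement is deterministic.
    options += [f"'{dc}': {rf}" for dc, rf in sorted(dc_replicas.items())]
    body = ", ".join(options)
    return "ALTER KEYSPACE %s WITH replication = {%s};" % (keyspace, body)


if __name__ == "__main__":
    # Hypothetical keyspace name and replication factor.
    print(alter_replication_cql("example_ks", {"Analytics": 2}))
```

After running the generated statement in cqlsh, a nodetool repair on the affected nodes lets the new replicas receive the data, which is the repair step mentioned above.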
We are running a 6-node cluster with:
HADOOP_ENABLED=0
SOLR_ENABLED=0
SPARK_ENABLED=0
CFS_ENABLED=0
Now we would like to add Spark to all of them. "Adding" is apparently not the right term, because otherwise this would not fail. Anyway, these are the steps we took:
1. drained one of the nodes
2. changed /etc/default/dse to SPARK_ENABLED=1 and HADOOP_ENABLED=0
3. sudo service dse restart
And got the following in the log:
ERROR [main] 2016-05-17 11:51:12,739 CassandraDaemon.java:294 - Fatal exception during initialization
org.apache.cassandra.exceptions.ConfigurationException: Cannot start node if snitch's data center (Analytics) differs from previous data center (Cassandra). Please fix the snitch configuration, decommission and rebootstrap this node or use the flag -Dcassandra.ignore_dc=true.
There are two related questions that have already been answered:
Unable to start solr aspect of DSE search
Two node DSE spark cluster error setting up second node. Why?
Unfortunately, clearing the data on the node is not an option - why would I do that? I need the data to be intact.
Using "-Dcassandra.ignore_rack=true -Dcassandra.ignore_dc=true" is a bit scary in production. I don't understand why DSE wants to create another DC and why can't it just use the existing one?
I know that, according to the DataStax docs, one should partition the load by using a different DC for each workload. In our case we just want to run Spark jobs on the same nodes that Cassandra is running on, using the same DC.
Is that possible?
Thanks!
The other answers are correct. The error is warning you that this node was previously identified as being in another DC. This means that it probably doesn't have the right data for any keyspaces that use NetworkTopologyStrategy (NTS). For example, if you had an NTS keyspace with only one replica in "Cassandra" and changed the DC to "Analytics", you could inadvertently lose all of the data.
This warning and the accompanying flag are telling you that you are doing something that you should not be doing in a production cluster.
The real solution to this is to name your DCs explicitly using GossipingPropertyFileSnitch (GPFS) and not rely on DseSimpleSnitch, which names each DC based on the DSE workload.
In this case, switch to GPFS and set the DC name to Cassandra.
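A minimal sketch of what that means in practice: with `endpoint_snitch: GossipingPropertyFileSnitch` set in cassandra.yaml, each node reads its DC and rack from cassandra-rackdc.properties. The helper below only renders that file's contents. The function name is hypothetical, and the rack value `rack1` is an assumption; it should match the rack the node reported before the switch (see the Rack column of nodetool status).

```python
def rackdc_properties(dc: str, rack: str) -> str:
    """Render the contents of cassandra-rackdc.properties for one node."""
    return f"dc={dc}\nrack={rack}\n"


if __name__ == "__main__":
    # "Cassandra" is the existing DC name from the error message above;
    # "rack1" is an assumed rack name, check it against nodetool status.
    print(rackdc_properties("Cassandra", "rack1"), end="")
```

Keeping the DC name at Cassandra means the node stays in the datacenter it already belongs to, so enabling Spark no longer changes where the snitch places it.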
I had set up a 50-node Apache Cassandra cluster.
I took one node and wanted to install DSE on it and make it a single-node DSE cluster.
I have removed /var/lib/cassandra and /var/log/cassandra.
I have truncated the system.peers table on the single node.
When I start dse cassandra on this node, I still see the remaining nodes handshaking and being added to this cluster.
What is the best way to completely remove any traces of the existing Cassandra cluster from this node?
You need to change the cluster_name setting in cassandra.yaml to a name different from that of the rest of the cluster.
I'm trying to add two Spark analytics nodes to my cassandra cluster, and I keep getting the error message:
"Error provisioning cluster: Must pass node_type_counts when adding to a vnode cluster."
I am using DSE 4.6.4 and OpsCenter 5.1.1. The 3 nodes in my Cassandra DC use vnodes, and I am trying to use vnodes in the analytics cluster as well, by setting num_tokens in the config to 256. Is this a bug in OpsCenter? Seems like the node_type_counts param might be missing from the post request that OpsCenter creates. Any help is appreciated.
I am using Cassandra 2.0 and the cluster has been set up with 3 nodes. nodetool status and nodetool ring show all three nodes. I have specified tokens for all the nodes.
I followed the steps below to change the configuration on one node:
1) sudo service cassandra stop
2) updated cassandra.yaml (to update thrift_framed_transport_size_in_mb)
3) sudo service cassandra start
The node did not start successfully, and system.log shows the exception below:
org.apache.cassandra.exceptions.ConfigurationException: Cannot change the number of tokens from 1 to 256
What is the best way to restart the node without losing the existing data on the node or in the cluster?
Switching from non-vnodes to vnodes has been a slightly tricky proposition for C*, and the mechanism previously used for this switch (shuffle) is somewhat notorious for instability.
The easiest way forward is to start fresh nodes (in a new datacenter) with vnodes enabled and to transfer data to those nodes via repair.
I was also getting this error while I was trying to change the number of tokens from 1 to 256. To solve this I tried the following:
Scenario:
I have a 4-node DSE (4.6.1) Cassandra cluster. Let's say their FQDNs are d0.cass.org, d1.cass.org, d2.cass.org and d3.cass.org. Here, the nodes d0.cass.org and d1.cass.org are the seed providers. My aim is to enable vnodes by changing the num_tokens attribute in the cassandra.yaml file.
Procedure to be followed for each node (one at a time):
1. Run nodetool decommission on one node.
2. Kill the Cassandra process on the decommissioned node. Find the process ID for DSE Cassandra using ps ax | grep dse, then run kill <pid>.
3. Once the node has been decommissioned successfully, go to one of the remaining nodes and check the status of the Cassandra cluster using nodetool status. The decommissioned node should not appear in the list.
4. Go to one of the active seed providers and run nodetool rebuild.
5. On the decommissioned node, open the cassandra.yaml file and uncomment num_tokens: 256. Save and close the file. If this node was originally a seed provider, make sure that its IP address is removed from the seeds: list in the cassandra.yaml file. If this is not done, the stale information it has about the cluster topology will interfere with the new topology provided by the new seed node. After a successful start, it can be added to the seed list again.
6. Restart the remaining cluster, either using the corresponding option in OpsCenter or by manually stopping Cassandra on each node and starting it again.
7. Finally, start Cassandra on the decommissioned node using the dse cassandra command.
This should work.
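The nodetool status check in the procedure (the decommissioned node should no longer be listed) can be automated with a small parser. This is only a sketch: the helper names and the sample output are made up for illustration, and it assumes each node line of nodetool status starts with a two-letter status/state code followed by the node address.

```python
import re

# Status (U/D) followed by state (N/L/J/M), then the node address.
NODE_LINE = re.compile(r"^([UD][NLJM])\s+(\S+)")

# Made-up output in the general shape printed by nodetool status.
SAMPLE_STATUS = """\
Datacenter: Cassandra
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load     Tokens  Owns   Host ID  Rack
UN  10.0.0.1  1.2 GB   1       33.3%  aaaa     rack1
UN  10.0.0.2  1.1 GB   1       33.3%  bbbb     rack1
"""


def nodes_in_status(status_output: str) -> dict:
    """Map each node address to its two-letter status/state code."""
    nodes = {}
    for line in status_output.splitlines():
        match = NODE_LINE.match(line.strip())
        if match:
            nodes[match.group(2)] = match.group(1)
    return nodes


def is_gone(status_output: str, address: str) -> bool:
    """True if the address no longer appears in the status output."""
    return address not in nodes_in_status(status_output)


if __name__ == "__main__":
    print(nodes_in_status(SAMPLE_STATUS))
    print(is_gone(SAMPLE_STATUS, "10.0.0.3"))
```

In a real run, the text passed in would be the captured output of nodetool status from one of the remaining nodes rather than the sample string.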