I have 3 nodes in a cluster:
Node1 = 127.0.0.1:9160
Node2 = 127.0.0.2:9161
Node3 = 127.0.0.3:9162
I want to use only one node (node1) for inserts. The other two nodes should provide fault tolerance while writing millions of records, i.e. when node1 is down, either node2 or node3 should take over the writes. For that I formed a cluster with a replication factor of 2 and added the seed nodes properly in the cassandra.yaml file. It is working fine. But because of partitioning, whenever I write data to node1, the rows get scattered across all the nodes in the cluster. So is there any way to use the other nodes only for replication? Or is there any way to disable partitioning?
Thanks in advance.
No. Cassandra is a fully distributed system.
What are you trying to achieve here? We have a 6-node cluster with RF=3, and since PlayOrm fixed the config bug they had in Astyanax, even if one node starts getting slow, the client automatically goes to the other nodes to keep the system fast. Why would you want to avoid great features like that? If your primary node gets slow, you would be screwed in your setup.
If you describe your use-case better, we might be able to give you better ideas.
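To illustrate why rows end up on every node, here is a toy model of token-based placement. It is only a sketch: the token values are made up, the MD5-modulo hash merely imitates what a hash partitioner does, and the replica rule assumes SimpleStrategy-style "next nodes clockwise" placement. Only the node names come from the question.

```python
import bisect
import hashlib

# Hypothetical, evenly spaced tokens for the three nodes in the question.
RING = {"node1": 0, "node2": 2**127 // 3, "node3": 2 * (2**127 // 3)}


def token_for(key):
    """Hash a row key to a token in [0, 2**127) (toy stand-in for a hash partitioner)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % 2**127


def replicas_for(key, ring, rf):
    """Return the rf nodes holding the key: the token owner plus the next nodes clockwise."""
    ordered = sorted(ring.items(), key=lambda item: item[1])
    tokens = [token for _, token in ordered]
    # First node whose token is >= the key's token, wrapping around the ring.
    start = bisect.bisect_left(tokens, token_for(key)) % len(ordered)
    return [ordered[(start + i) % len(ordered)][0] for i in range(rf)]


for key in ("row-1", "row-2", "row-3", "row-4"):
    print(key, replicas_for(key, RING, rf=2))
```

Whichever node the client connects to only acts as coordinator; where a row is stored is decided by the row key's token, so with RF=2 on three nodes each row lands on two of them and the set of rows as a whole is spread over all three.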
Related
Does Cassandra maintain the RF when a node goes down? For example, if the number of nodes is 5 and RF is 2, then when a single node goes down, does the remaining replica copy its data to some other node to maintain the RF of 2?
The DataStax documentation mentions that "If a node fails, the load is spread evenly across other nodes in the cluster". Does this mean that data is migrated when a node goes down? Is this a feature available only in DataStax's Cassandra and not in Apache Cassandra?
No. Instead, a "hint" is stored on the coordinator node and is eventually written to the node that owns the token range once that node comes back up. Whether the write succeeds depends on your consistency level. In the above example, the write will succeed if you are writing with consistency level ONE.
If the node is down only for a short period, it will receive the missed writes from the hints stored on other nodes when it comes back. But if you decommission a node, its data gets streamed to other nodes, and those nodes take over the new token ranges (the same applies when a node is added to the cluster).
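The hint mechanism described above can be sketched with a toy in-memory model. This is an illustration only, not Cassandra's implementation: the class and method names are hypothetical, a single `hints` dictionary stands in for the coordinator's hint storage, and the hint window and expiry are ignored.

```python
from collections import defaultdict


class ToyCluster:
    """Toy model of hinted handoff; not how Cassandra is implemented."""

    def __init__(self, nodes):
        self.data = {node: {} for node in nodes}
        self.up = {node: True for node in nodes}
        self.hints = defaultdict(list)  # down replica -> [(key, value), ...]

    def write(self, replicas, key, value, required_acks=1):
        """Write to the live replicas and keep a hint for each one that is down."""
        live = [node for node in replicas if self.up[node]]
        if len(live) < required_acks:
            return False  # consistency level cannot be met, nothing is written
        for node in replicas:
            if self.up[node]:
                self.data[node][key] = value
            else:
                self.hints[node].append((key, value))
        return True

    def bring_up(self, node):
        """Mark a node as up again and replay the hints stored for it."""
        self.up[node] = True
        for key, value in self.hints.pop(node, []):
            self.data[node][key] = value


cluster = ToyCluster(["n1", "n2", "n3", "n4", "n5"])
cluster.up["n2"] = False
print(cluster.write(["n1", "n2"], "k", "v", required_acks=1))  # consistency level ONE
print(cluster.data["n3"])  # no other node receives a copy while n2 is down
cluster.bring_up("n2")
print(cluster.data["n2"])  # n2 catches up from the hint
```

Note that no third node receives a copy while n2 is down: the replica set stays the same and the missing write is delivered later.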
Over time the data in one replica can become inconsistent with others and the repair process helps Cassandra in fixing them - https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsRepairNodesTOC.html
This is applicable in Apache Cassandra as well.
I have a keyspace with the replication factor set to 3, but I have only a single node. Will the disk space used then be 3 times the data size? As the replicas are not yet assigned to any other nodes, will Cassandra stop creating replicas until new nodes join the cluster?
No, the disk space used would not be three times the size. The single node would own the entire token range and all writes would be written to that single node once.
What happens with the writes for the other two replicas depends on whether those nodes were previously present in the cluster and are currently down, or have never been added to the cluster. If they had never been added, then C* would just skip trying to write to them.
If they had been added but are currently down, and if you have hinted handoffs enabled and are still within the hinted handoff window, then C* will store hints for the down nodes on the single up node.
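As a rough sketch of the arithmetic involved (the function names are hypothetical, and only the ONE, QUORUM and ALL consistency levels are modelled), the number of copies actually stored and the number of replica acknowledgements a write needs can be worked out like this:

```python
def copies_written(replication_factor, nodes_in_ring):
    """A row is stored at most once per node, so copies are capped by the node count."""
    return min(replication_factor, nodes_in_ring)


def required_acks(consistency_level, replication_factor):
    """Replica acknowledgements needed for a write at the given consistency level."""
    if consistency_level == "ONE":
        return 1
    if consistency_level == "QUORUM":
        return replication_factor // 2 + 1
    if consistency_level == "ALL":
        return replication_factor
    raise ValueError("unsupported consistency level: " + consistency_level)


def write_can_succeed(consistency_level, replication_factor, live_replicas):
    """True if enough replicas are alive to satisfy the consistency level."""
    return live_replicas >= required_acks(consistency_level, replication_factor)


print(copies_written(3, 1))                # 1
print(write_can_succeed("ONE", 3, 1))      # True
print(write_can_succeed("QUORUM", 3, 1))   # False
```

Under these assumptions, a single node with RF=3 stores one copy and can acknowledge writes at ONE, while QUORUM would need two live replicas.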
It depends on the replication strategy you have used. Assuming your queries are working, you might have used SimpleStrategy. If you try to write to such a configuration, your write should fail, as it needs to write to 2 additional replica nodes before it gives an acknowledgement to the client; in the case of SimpleStrategy these are the next two clockwise nodes in the ring.
If you use rack designations and add new racks to an existing cluster, do you have to rebalance everything afterwards?
For example, we currently have 2 racks. Then we'll double the node count by adding two new racks. Will Cassandra have to rebalance the replicas? The primary tokens will be balanced because the new nodes will have the correct tokens. But the replicas seem like they would be interspersed incorrectly.
If we know we'll be adding racks in the future, but we cannot afford to rebalance the cluster, should we just avoid racks completely in the first place?
Cassandra version is 1.2
OK, after working with Cassandra for longer: the short answer is yes. Because of how Cassandra places data, if the nodes are not perfectly balanced across racks and in the right order, you'd either have to rebalance or physically move the hardware to balance the racks. It seems better to just balance the cluster manually by having the tokens alternate between racks than to use Cassandra's internal rack designation. However, with vnodes this becomes less of an issue, I believe.
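The effect of rack ordering can be seen in a small simulation. This is a simplified stand-in for rack-aware placement, not NetworkTopologyStrategy's exact algorithm: it assumes one datacenter, equal-sized token ranges, and a rule of "walk the ring clockwise and prefer nodes in racks not used yet". Node and rack names are made up.

```python
from collections import Counter


def rack_aware_replicas(ring, start, rf):
    """Pick rf replicas for the range owned by ring[start], preferring unused racks.

    ring is a list of (node, rack) pairs in token order.
    """
    order = [ring[(start + i) % len(ring)] for i in range(len(ring))]
    chosen, racks_used = [], set()
    # First pass: one node per rack, walking clockwise from the range owner.
    for node, rack in order:
        if len(chosen) < rf and rack not in racks_used:
            chosen.append(node)
            racks_used.add(rack)
    # Second pass: if rf exceeds the number of racks, fill up with the nearest other nodes.
    for node, _ in order:
        if len(chosen) < rf and node not in chosen:
            chosen.append(node)
    return chosen


def load_per_node(ring, rf):
    """Count how many token ranges each node stores a replica of."""
    load = Counter()
    for start in range(len(ring)):
        load.update(rack_aware_replicas(ring, start, rf))
    return load


alternating = [("a", "r1"), ("b", "r2"), ("c", "r1"), ("d", "r2")]
contiguous = [("a", "r1"), ("b", "r1"), ("c", "r2"), ("d", "r2")]
print(dict(load_per_node(alternating, rf=2)))  # {'a': 2, 'b': 2, 'c': 2, 'd': 2}
print(dict(load_per_node(contiguous, rf=2)))   # {'a': 3, 'c': 3, 'b': 1, 'd': 1}
```

With tokens alternating between racks every node holds two ranges; with the racks laid out contiguously on the ring, two nodes hold three ranges each and the other two only one.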
We have a Cassandra cluster with a single token per node, 22 nodes in total, and an average load of 500 GB per node. It uses SimpleStrategy for the main keyspace and SimpleSnitch.
We need to migrate all data to the new datacenter and shut down the old one without downtime. The new cluster has 28 nodes. I want to have vnodes on it.
I'm thinking of the following process:
1. Migrate the old cluster to vnodes
2. Set up the new cluster with vnodes
3. Add nodes from the new cluster to the old one and wait until it balances everything
4. Switch clients to the new cluster
5. Decommission nodes from the old cluster one by one
But there are a lot of technical details. First of all, should I shuffle the old cluster after vnodes migration? Then, what is the best way to switch to NetworkTopologyStrategy and to GossipingPropertyFileSnitch? I want to switch to NetworkTopologyStrategy because new cluster has 2 different racks with separate power/network switches.
should I shuffle the old cluster after vnodes migration?
You don't need to. If you go from one token per node to 256 (the default), each node will split its range into 256 adjacent, equally sized ranges. This doesn't affect where data lives. But it means that when you bootstrap in a new node in the new DC it will remain balanced throughout the process.
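A small sketch of what "splitting a range into adjacent, equally sized ranges" means. The function is illustrative only: it ignores ring wrap-around, and the range bounds in the example are made up.

```python
def split_range(start, end, parts=256):
    """Split the token range (start, end] into adjacent, near-equal sub-ranges."""
    width = end - start
    bounds = [start + width * i // parts for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


# Hypothetical range owned by one node before the vnode upgrade.
sub_ranges = split_range(0, 2**20)
print(len(sub_ranges))                     # 256
print(sub_ranges[0], sub_ranges[-1])       # (0, 4096) (1044480, 1048576)
```

The sub-ranges cover exactly the same span as before, which is why the node still owns the same data after the split.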
what is the best way to switch to NetworkTopologyStrategy and to GossipingPropertyFileSnitch?
The difficulty is that switching replication strategy is in general not safe, since data would need to be moved around the cluster. NetworkTopologyStrategy (NTS) will place data on different nodes if you tell it nodes are in different racks. For this reason, you should move to NTS before adding the new nodes.
Here is a method to do this, after you have upgraded the old cluster to vnodes (your step 1 above):
1a. List all existing nodes as being in DC0 in the properties file. List the new nodes as being in DC1 and their correct racks.
1b. Change the replication strategy to NTS with options DC0:3 (or whatever your current replication factor is) and DC1:0.
Then to add the new nodes, follow the process here: http://www.datastax.com/docs/1.2/operations/add_replace_nodes#adding-a-data-center-to-a-cluster. Remember to set the number of tokens to 256 since it will be 1 by default.
In step 5, you should set the replication factor for DC0 to be 0 i.e. change replication options to DC0:0, DC1:3. Now those nodes aren't being used so decommission won't stream any data but you should still do it rather than powering them off so they are removed from the ring.
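The two replication settings mentioned above could be expressed as CQL3 statements along these lines. This is a sketch: the helper function and the keyspace name `my_keyspace` are hypothetical, and the replication factor of 3 is only the example value from the text. The statements still have to be run through cqlsh or a driver.

```python
def alter_replication_cql(keyspace, dc_replicas):
    """Build an ALTER KEYSPACE statement that uses NetworkTopologyStrategy."""
    options = ", ".join("'%s': %d" % (dc, rf) for dc, rf in dc_replicas.items())
    return (
        "ALTER KEYSPACE %s WITH replication = "
        "{'class': 'NetworkTopologyStrategy', %s};" % (keyspace, options)
    )


# Step 1b: all replicas stay in the old datacenter.
print(alter_replication_cql("my_keyspace", {"DC0": 3, "DC1": 0}))
# Final step: all replicas live in the new datacenter.
print(alter_replication_cql("my_keyspace", {"DC0": 0, "DC1": 3}))
```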
Note one risk is that writes made at a low consistency level to the old nodes could get lost. To guard against this, you could write at CL.LOCAL_QUORUM after you switch to the new DC. There is still a small window where writes could get lost (between steps 3 and 4). If it is important, you can run repair before decommissioning the old nodes to guarantee no losses or write at a high consistency level.
If you are trying to migrate to a new cluster with vnodes, wouldn't you need to change the partitioner? The documentation says that it isn't a good idea to migrate data between different partitioners.
With the objective of speeding up the migration process of a full production cassandra cluster, I would like to know if anyone has tried to simultaneously run cassandra's sstableloader from two nodes at the same time.
Those nodes would be out of the destination cassandra's ring and they would stream different data to the ring.
Has anyone tried this?
Thank you.
I have tried this with multiple simultaneous sstableloaders without any issue. In my case the SSTable-sets were created by a map-reduce job resulting in a set of SSTables per reducer that were later loaded via the sstableloader.
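For illustration, several sstableloader runs could be launched in parallel roughly as follows. This is a sketch, not what I actually ran: it starts all loaders from one machine rather than from two separate nodes, the directory paths and contact point are placeholders, and the `run` parameter exists only so the example can be tried as a dry run without Cassandra installed. `-d` is sstableloader's option for the initial contact hosts.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(sstable_dir, contact_points):
    """Command line for one sstableloader run; sstable_dir should end in <keyspace>/<table>."""
    return ["sstableloader", "-d", ",".join(contact_points), sstable_dir]


def load_all(sstable_dirs, contact_points, run=subprocess.run, max_parallel=2):
    """Run one sstableloader per directory, at most max_parallel at a time."""
    commands = [build_command(d, contact_points) for d in sstable_dirs]
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(lambda cmd: run(cmd, check=True), commands))


def dry_run(cmd, check=True):
    """Stand-in runner that returns the command line instead of executing it."""
    return " ".join(cmd)


# Placeholder paths and host; pass run=subprocess.run (the default) for a real load.
for line in load_all(["/data/set1/ks/table", "/data/set2/ks/table"], ["10.0.0.1"], run=dry_run):
    print(line)
```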