Three node Cassandra cluster all nodes configured different rack in same dc - cassandra

I am a newbie in Cassandra.
In our production environment three node Cassandra clusters are running and serving production traffic but I have below mentioned doubts:-
1) All nodes are configured in different racks i.e rack 1, rack 2 and rack 3 in the same dc. Is it fine or does this configuration have some drawbacks?
2) We are using rf2 and network topology for all the keyspaces except system tables and these system tables are configured with rf2 and simplestrategy ..is it fine or does this need to be changed? should we increase the replication factor of system_auth? ..please let me know..
3) Now I want to add another node in the same dc, what will be the best procedure to do the same without impacting the live traffic?
Cassandra version is Apache cassandra 3.11.
Thanks in advance..

Ans 1) It seems good to have Cassandra nodes in different racks for availability and fault tolerance .
Ans 2) You must increase RF on system_auth so that you can avoid cqlsh login issue from other nodes.
Ans 3) You can add new node without affecting the live traffic on existing cluster. please follow below procedure.
http://cassandra.apache.org/doc/latest/operating/topo_changes.html

Cassandra is designed as a distributed system. Cassandra’s distributed architecture is specifically tailored for multiple-data center deployment. These features are robust and flexible enough that you can configure the cluster for optimal geographical distribution, for redundancy for fail-over and disaster recovery.
Multiple data center deployments are excellent for global solutions where in some applications are operational in one region and other applications in another region and yet using a single cluster of Cassandra which is working in multiple data centers across regions.
For single region applications, still having multiple data-centers is preferred option because it provides disaster recovery even in case one region goes down.
Ans 1) For a single DC Cassandra cluster , recommendation is to have 4 nodes with RF3. Rack 1 with 2 nodes and Rack 2 with 2 nodes. Remember that nodes in the same rack have faster network than nodes in different racks. With two nodes on the same Rack, queries with LOCAL_QUORUM will be faster as compared to queries on a cluster with all nodes on different racks.
If you are not concerned with the query latency , all nodes in different racks (3 racks) will give better disaster recovery as compared with two RACK deployment. Having said that, it's always recommended to use multi DC deployments for production cluster.
Ans 2) It’s always recommended to increase the replication factor of System_auth keyspace and change the replication class to NetworkTopologyStrategy. Please follow this documentation for more details https://docs.datastax.com/en/security/6.0/security/secSystemKeyspace.html
Ans 3) Yes, You can add a new node to existing cluster with ease without impacting the traffic. Please follow this documentation for more details: https://docs.datastax.com/en/archived/cassandra/3.0/cassandra/operations/opsAddNodeToCluster.html

Related

Add datacenter with one node to backup existing one

I already have a working datacenter with 3 nodes (replication factor 2). I want to add another datacenter with only one node to have all backup data from existing datacenter. The final solution:
dc1: 3 nodes (2 rf)
dc2: 1 node (1 rf)
My application would then connect only to dc1 nodes and send data. If dc1 breaks down I can recover data from dc2 which is on the other physical machine in different location. I could also use dc2 for AI queries or some other task. I'm a newbie in case of cassandra configuration so I want to know if I'm not making some kind of a mistake in my thinking. I'm planing on using this configuration docs to add new dc: https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/operations/opsAddDCToCluster.html
Is there anything more I should keep in mind to get this to work or some easier solution to have data backup?
Update: It won't only be a backup, we wont to use this second DC for connecting application also when dc1 would be unavailable (ex. power outage).
Update: dc2 is running, I had some problems with coping data from one dc to other and nodetool status didn't show 2 dc's but after fixing firewall rules for port 7000 I managed to connect both dc's and share data between them.
with this approach, your single node will get 2 times more traffic than other nodes. Also, it may add a load to the nodes in dc1 because they will need to collect hints, etc. when node in dc2 is not available. If you need just backup, setup something like medusa, and store data in the cheap environment, like, S3 - but of course, it will require time to restore if you lose the whole DC.
But in reality, you need to think about your high-availability strategy - what will happen with your clients if you lose the primary DC? Is it critical to wait until recovery, or you're really requiring full fault tolerance? I recommend to read the Designing Fault-Tolerant Applications with DataStax and Apache Cassandra™ whitepaper from DataStax - it explains the details of designing really fault tolerant applications.

Canssandra replication within different data centers

Currently, we are facing a problem, suppose that we have GCP and Azure as our cloud solution, we want to set up the cassandra cluster like this,
3 nodes on GCP as the cluster, data is stored by setting replicators
1 node on Azure(only hot copy) to keep all data for 3 nodes on GCP.
Is this doable?
Yes, it should be possible declaring them as two different datacenters. Cassandra allows you to have different replication factors depending on the DC used:
replication = {'class': 'NetworkTopologyStrategy', 'gcp1': 3, 'azure1': 1};
This is a use case similar to the one explained here
There are gotchas, first, you need to ensure that the communication between the datacenters can be established; as the traffic will go through the internet, it will be highly recommendable that SSL encryption is enabled for the communication between datacenters.
The gcp1 datacenter should be set to use the GoogleCloudSnitch snitch, the definition of the topology is done in cassandra-rackdc.properties.
For the azure1 datacenter in Azure can be tricky as it should run in virtual machines, you can find a way to set this up here, note that the snitch in this DC will be GossipingPropertyFileSnitch, and the topology is defined in cassandra-topology.properties.
The instructions of how to add the second datacenter can be found here.

Creating Cassandra sub-clusters

I need to create K overlapping Cassandra clusters on N machines (K>>N). Each cluster can have between 1 to N nodes. I know that one way of doing so is to create a separate process (or docker container) for each cluster a node is a member of.
My question however is that can I change Cassandra to allow the creation of sub-clusters? meaning that there would be only 1 Cassandra instance running on each node, but I would be able to take control of data replication and data placement so that within a sub-cluster for example, I would be able to do a Quorum write for example.
No, it's not possible to define the sub-cluster as you describe - there is always a single Cassandra cluster per process.
But Cassandra has a notion of the Datacenter that defines where machine resides, and the keyspace that defines how the data is replicated between datacenters and nodes. And consistency level, like, QUORUM depends on the keyspace configuration.
In your case I would think in that direction - define datacenters, create necessary keyspaces, and setup correct replication factors for that keyspaces.

practicallity of having a single node cassandra multisitecluster(3 way)

Is it possible to put a Cassandra cluster with single node DC with 2 remote DC which is also having a single node assuming the replication factor is required to be 3 in this case? The remote cluster is in the same geographical area but not same building for HA. Or is there any hard rules that for high availability and consistency for a need for a local quorum node to achieve that?
Our setup may be smaller compared to big data and usually used to store time series data with approximately 2000/3000 (on different key) sampling per second.
Is there other implications other than read/write may be slow due to the comms delay?
disclaimer: I am new to cassandra.
Turns out I want to deploy a similar setup: 3 nodes on aws, each in its own AZ (But all in the same region). from what I read, this setup is just a single DC, with 3 nodes.
You need to use Ec2Snitch to reduce the latency between your clients and the nodes.
Using RF=3 provides you with the HA that you need, since every node has all the data
Inter-AZ communication should be fairly fast. refer to this: http://highscalability.com/blog/2016/8/1/how-to-setup-a-highly-available-multi-az-cassandra-cluster-o.html
becuase you'll be running in a single DC, local-quorum == quorum. so as long as you'll be writing to QUROUM (which requires 2/3 nodes (AZs) to be up), you'll be strongly consistent and HA.

Ability to write to a particular cassandra node

Is there a possibility to write to a particular node using datastax driver?
For example, I have three nodes in datacenter 1 and three nodes in datacenter 2.
Existing
If i build up the cluster with any one of them as seed, all the nodes will get detected by the datastax java driver. So, in this case, if i insert a data using driver, it will automatically choose one of the nodes and proceed with it as the co-ordinator(preferably local data center)
Requirement
I want a way to contact any node in datacenter 2 and hand over the co-ordinator job to one of the nodes in datacenter 2.
Why i need this
I am trying to use the trigger functionality from datacenter 2 alone. Since triggers are taken care by co-ordinator , i want a co-ordinator to be selected from datacenter 2 so that data center 1 doesnt have to do this operation.
You may be able to use the DCAwareRoundRobinPolicy load balancing policy to achieve this by creating the policy such that DC2 is considered the "local" DC.
Cluster.Builder builder = Cluster.builder().withLoadBalancingPolicy(new DCAwareRoundRobinPolicy("dc2"));
In the above example, remote (non-DC2) nodes will be ignored.
There is also a new WhiteListPolicy in driver version 2.0.2 that wraps another load balancing policy and restricts the nodes to a specific list you provide.
Cluster.Builder builder = Cluster.builder().withLoadBalancingPolicy(new WhiteListPolicy(new DCAwareRoundRobinPolicy("dc2"), whiteList));
For multi-DC scenarios Cassandra provides EACH and LOCAL consistency levels where EACH will acknowledge successful operation in each DC and LOCAL only in local one.
If I understood correctly, what you are trying to achieve is DC failover in your application. This is not a good practice. Let's assume your application is hosted in DC1 alongside with Cassandra. If DC1 goes down, your entire application is unavailable. If DC2 goes down, your application still can write with LOCAL CL and C* will replicate changes when DC2 is back.
If you want to achieve HA, you need to deploy application in each DC, use CL=LOCAL_X and finally do failover on DNS level (e.g. using AWS Route53).
See data consistency docs and this blog post for more info about consistency levels for multiple DCs.

Resources