How to safely decommission Analytics DC in Cassandra?

I have a Cassandra cluster with 6 nodes: 3 are in the main Cassandra DC, and 3 are in an Analytics DC. I no longer need the Analytics DC, and I want to decommission it. I want to make sure I am doing it safely, so that I do not affect the Cassandra DC or my clients. I have only one keyspace that is replicated across the DCs, and I was planning to use ALTER KEYSPACE to simply remove the replication to the Analytics DC. After that, I'd just terminate the Analytics nodes in EC2. Is this a safe plan?

1) Use ALTER KEYSPACE to remove the analytics DC from the replication strategy.
2) Use nodetool decommission to safely remove those nodes from the cluster (one at a time, ideally). They'll no longer own any data, so they'll have nothing to stream to their neighbors.
3) Terminate the instances.
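
As a rough illustration of step 1, the sketch below builds the ALTER KEYSPACE statement with the analytics DC left out of the replication map. The keyspace name my_keyspace, the DC names Cassandra and Analytics, and the replication factors are assumptions made for the example, not values taken from the question; check the real ones with DESCRIBE KEYSPACE before running anything.

```python
def drop_dc_from_replication(keyspace, replication, dc_to_remove):
    """Return an ALTER KEYSPACE statement that omits dc_to_remove."""
    remaining = {dc: rf for dc, rf in replication.items() if dc != dc_to_remove}
    if not remaining:
        raise ValueError("refusing to drop the last datacenter")
    opts = ", ".join(f"'{dc}': {rf}" for dc, rf in remaining.items())
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {opts}}};"
    )


# Hypothetical keyspace name, DC names and replication factors.
stmt = drop_dc_from_replication(
    "my_keyspace", {"Cassandra": 3, "Analytics": 3}, "Analytics"
)
print(stmt)
```

The function only produces the statement text; it would still need to be run through cqlsh or a driver, followed by nodetool decommission on each analytics node as described in step 2.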

Related

Cassandra repair after datacenter went down

I have a Cassandra db (version 3.11.2) running in AWS, with 2 datacenters, each in a different AWS region and with 3 nodes in each one.
The replication factor on all keyspaces is 3, so there is full replication of data on every node. The size of data is about 10GB per node.
All of our writes are in LOCAL_QUORUM against one DC (let's call it DC1). Basically the other DC is just for a kind of backup and disaster recovery: if the AWS region for DC1 becomes unavailable, we will redirect traffic to DC2.
My issue is that we had a network disconnection between the two DCs, for several hours, and after several days we noticed that there is missing data in DC2. This all makes sense, since the time the DCs were apart is larger than the Hinted Handoff window (3 hours). So we need to run a repair to bring DC2 back to sync with DC1.
I went over the cassandra docs, and read countless SO answers and for the life of me I couldn't understand what is the right repair to do...
Do I need to issue a 'nodetool repair --full --sequential' from only one node? Do I need to run it on every node in the cluster? Maybe it's better to run 'nodetool rebuild'?
Running nodetool repair on the nodes in datacenter2 should bring the data back in sync, but depending on the amount of data affected, this can take considerable time and resources. If datacenter2 serves only as a backup for disaster recovery, it may be easier and quicker to back up the current DC1 cluster and restore it in the second datacenter (more information is available here).
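
The reasoning in the question (an outage longer than the hinted handoff window means hints alone cannot catch DC2 up) can be written as a tiny check. This is only a sketch: the 3 hour window is the value quoted in the question, and the outage durations are made-up examples.

```python
from datetime import timedelta

# Hint window quoted in the question; the real value lives in cassandra.yaml.
MAX_HINT_WINDOW = timedelta(hours=3)


def hints_cover_outage(outage, window=MAX_HINT_WINDOW):
    """True if the outage was short enough for hints to replay every missed write."""
    return outage <= window


for outage in (timedelta(hours=2), timedelta(hours=5)):
    action = "hints should suffice" if hints_cover_outage(outage) else "repair needed"
    print(outage, "->", action)
```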

Creating Cassandra sub-clusters

I need to create K overlapping Cassandra clusters on N machines (K>>N). Each cluster can have between 1 to N nodes. I know that one way of doing so is to create a separate process (or docker container) for each cluster a node is a member of.
My question, however, is whether I can change Cassandra to allow the creation of sub-clusters. That is, there would be only 1 Cassandra instance running on each node, but I would be able to take control of data replication and data placement so that, for example, I could do a QUORUM write within a sub-cluster.
No, it's not possible to define the sub-cluster as you describe - there is always a single Cassandra cluster per process.
But Cassandra has the notion of a datacenter, which defines where a machine resides, and of a keyspace, which defines how the data is replicated between datacenters and nodes. What a consistency level such as QUORUM requires depends on the keyspace configuration.
In your case I would think in that direction: define datacenters, create the necessary keyspaces, and set up the correct replication factors for those keyspaces.
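
To make the link between keyspace configuration and consistency level concrete, here is a small sketch of the usual quorum arithmetic: a quorum is floor(RF / 2) + 1, where QUORUM uses the sum of the replication factors over all datacenters and LOCAL_QUORUM uses only the local datacenter's factor. The datacenter names and factors below are hypothetical.

```python
def quorum(replication_factor):
    """Number of replicas that must acknowledge: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


# Hypothetical NetworkTopologyStrategy settings for one keyspace.
replication = {"dc_a": 3, "dc_b": 2}

# QUORUM counts replicas across all datacenters.
cluster_quorum = quorum(sum(replication.values()))

# LOCAL_QUORUM counts only the replicas in the coordinator's datacenter.
local_quorums = {dc: quorum(rf) for dc, rf in replication.items()}

print("QUORUM needs", cluster_quorum, "replicas")
print("LOCAL_QUORUM needs", local_quorums)
```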

Cassandra token assignment for multiple DC cluster

I am a little confused about token assignment for multiple DCs. When running nodetool ring, we can see that all the tokens are (and need to be) different, even for nodes in different DCs. Does that mean all nodes in the cluster form only one ring, or do each DC's nodes form their own ring?
That's right, the Cassandra token range spans the entire cluster, so there will only be one primary node responsible for any piece of data.
Managing data across multiple datacenters is handled by specifying the desired replication strategy, e.g. NetworkTopologyStrategy.
Each DC's nodes form a ring in its own cluster. For multiple DCs, each DC has its own distinct partition range, independent of the others. As @alec-collier mentioned, NetworkTopologyStrategy will figure out the replicas for a partition within each DC.
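
One way to picture how the two answers fit together is the simplified model below: tokens are unique across the whole cluster, and replicas for a partition are found by walking that ring clockwise and taking nodes per datacenter until each datacenter has its configured number. This is a sketch only; it ignores racks and virtual nodes, and the tokens, node names and datacenter names are invented for the example.

```python
from bisect import bisect_left

# One cluster-wide ring: (token, node, datacenter), sorted by token.
RING = sorted([
    (0, "dc1-a", "DC1"), (10, "dc2-a", "DC2"), (20, "dc1-b", "DC1"),
    (30, "dc2-b", "DC2"), (40, "dc1-c", "DC1"), (50, "dc2-c", "DC2"),
])


def replicas(token, rf_per_dc):
    """Walk the ring clockwise from the token's owner, filling each DC's quota."""
    tokens = [t for t, _, _ in RING]
    # The owner is the first node whose token is >= the partition token,
    # wrapping around to the start of the ring if there is none.
    start = bisect_left(tokens, token) % len(RING)
    chosen = {dc: [] for dc in rf_per_dc}
    for i in range(len(RING)):
        _, node, dc = RING[(start + i) % len(RING)]
        if dc in chosen and len(chosen[dc]) < rf_per_dc[dc]:
            chosen[dc].append(node)
    return chosen


print(replicas(25, {"DC1": 2, "DC2": 1}))
```

In this model the same walk serves every datacenter, but each datacenter only counts its own nodes, which is why it can look as if every DC has a ring of its own.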

Cassandra and Spark

Hi I have a high level question regarding cluster topology and data replication with respect to cassandra and spark being used together in datastax enterprise.
It was my understanding that if there were 6 nodes in a cluster and there is heavy computing (e.g. analytics) being done, then you could have 3 Spark nodes and 3 Cassandra nodes if you want. Or you could use fewer than three nodes for analytics, but your jobs would not run as fast. The reason you don't want the heavy analytics on the Cassandra nodes is that the local memory is already being used up to handle the heavy read/write load of Cassandra.
This much is clear, but here are my questions :
How does the replicated data work then?
Are all the cassandra only nodes in one rack, and all the spark nodes in another rack?
Does all the data get replicated to the spark nodes?
How does that work if it does?
What are the recommended configuration steps to make sure the data is replicated properly to the spark nodes?
How does the replicated data work then?
Regular Cassandra replication will operate between nodes and DC's. As far as replication goes this is the same as having a c* only cluster with two data centers.
Are all the cassandra only nodes in one rack, and all the spark nodes in another rack?
With the default DSE Snitch, your C* nodes will be in one DC and the Spark nodes in another DC. They will all be in a default rack. If you want to use multiple racks you will have to configure that yourself by using an advanced snitch. GPFS (GossipingPropertyFileSnitch) or PFS (PropertyFileSnitch) are good choices depending on your orchestration mechanisms. Learn more in the DataStax documentation.
Does all the data get replicated to the spark nodes? How does that work if it does?
Replication is controlled at the keyspace level and depends on your replication strategy:
SimpleStrategy will simply ask you the number of replicas you want in your cluster (it is not data center aware so don't use it if you have multiple DC's)
CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
This assumes you only have one DC and that you'll have 3 copies of each bit of data.
NetworkTopologyStrategy lets you pick the number of replicas per DC:
CREATE KEYSPACE tst WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 2, 'DC2': 3};
You can choose to have a different number of replicas per DC.
What are the recommended configuration steps to make sure the data is replicated properly to the spark nodes?
The procedure to update RF is in the datastax documentation. Here it is verbatim:
Updating the replication factor
Increasing the replication factor increases the total number of copies of keyspace data stored in a Cassandra cluster. If you are using security features, it is particularly important to increase the replication factor of the system_auth keyspace from the default (1) because you will not be able to log into the cluster if the node with the lone replica goes down. It is recommended to set the replication factor for the system_auth keyspace equal to the number of nodes in each data center.
Procedure
Update a keyspace in the cluster and change its replication strategy options.
ALTER KEYSPACE system_auth WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'dc1' : 3, 'dc2' : 2};
Or if using SimpleStrategy:
ALTER KEYSPACE "Excalibur" WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };
On each affected node, run the nodetool repair command. Wait until repair completes on a node, then move to the next node.
Know that increasing the RF in your cluster will generate lots of IO and CPU utilization as well as network traffic, while your data gets pushed around your cluster.
If you have a live production workload, you can throttle the impact by using nodetool getstreamthroughput / nodetool setstreamthroughput.
You can also throttle the resulting compactions with nodetool getcompactionthroughput / nodetool setcompactionthroughput.
How do Cassandra and Spark work together on the analytics nodes and not fight for resources? If you are not going to limit Cassandra at all in the whole cluster, then what is the point of limiting Spark? Just have all the nodes Spark enabled.
The key point is that you won't be pointing your main transactional reads / writes at the Analytics DC (use a consistency level such as LOCAL_ONE or LOCAL_QUORUM to keep those requests in the C* DC). Don't worry, your data still arrives at the analytics DC by virtue of replication, but you won't wait for acks to come back from analytics nodes in order to respond to customer requests. The second DC is eventually consistent.
You are right in that cassandra and spark are still running on the same boxes in the analytics DC (this is critical for data locality) and have access to the same resources (and you can do things like control the max spark cores so that cassandra still has breathing room). But you achieve workload isolation by having two Data Centers.
DataStax drivers, by default, will consider the DC of the first contact point they connect with as the local DC, so just make sure that your contact point list only includes machines in the local (C*) DC.
You can also specify the local datacenter yourself depending on the driver. Here's an example for the ruby driver, check the driver documentation for other languages.
Use the :datacenter cluster method: the first datacenter found will be assumed current by default. Note that you can skip this option if you specify only hosts from the local datacenter in the :hosts option.
You are correct, you want to separate your cassandra and your analytics workload.
A typical setup could be:
3 Nodes in one datacenter (name: cassandra)
3 Nodes in second datacenter (name: analytics)
When creating your keyspaces you define them with a NetworkTopologyStrategy and a replication factor defined for each datacenter, like so:
CREATE KEYSPACE myKeyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'cassandra': 2, 'analytics': 2};
With this setup, your data will be replicated twice in each datacenter. This is done automatically by Cassandra. So when you insert data in DC cassandra, the inserted data will get replicated to DC analytics automatically, and vice versa. Note: you can control what data is replicated by using separate keyspaces for the data you want to be analyzed and the data you don't.
In your cassandra.yaml you should use the GossipingPropertyFileSnitch. With this snitch you can define the DC and the rack of your node in the file cassandra-rackdc.properties. This information then gets propagated via the gossip protocol. So each node learns the topology of your cluster.
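
As a small sketch of what that per-node configuration looks like, the snippet below renders the two keys of cassandra-rackdc.properties for each node of the setup described above. The node names and the rack name rack1 are assumptions for illustration; the datacenter names are the ones used in this answer. The snitch itself is selected in cassandra.yaml with endpoint_snitch: GossipingPropertyFileSnitch.

```python
def rackdc_properties(dc, rack):
    """Render the contents of cassandra-rackdc.properties for one node."""
    return f"dc={dc}\nrack={rack}\n"


# Hypothetical node names and rack; DC names follow the answer above.
nodes = {
    "node1": ("cassandra", "rack1"),
    "node4": ("analytics", "rack1"),
}

for name, (dc, rack) in nodes.items():
    print(f"# {name}: cassandra-rackdc.properties")
    print(rackdc_properties(dc, rack))
```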

How to separate ring from cluster in cassandra

We have a Cassandra DSE cluster with 10 nodes in the Cassandra ring and 10 nodes in the Hadoop ring. Currently the application writes the data to the Cassandra ring, and Cassandra replicates the data to the Hadoop ring.
We want to separate the two rings into two different clusters and have the application write the data to both clusters at the same time.
How can we separate the cluster? Is that possible?
We have ~600GB of data in the cluster and we cannot delete it.
You should test this first, but this basic procedure should work. It will need some tweaking if you have counters.
Set your application writing to both DCs using LOCAL_QUORUM.
Run repair on the whole cluster. This is to ensure each DC has a copy of the data.
Isolate the clusters so the two DCs can't talk to each other, probably using a firewall.
Assuming your DCs are DC1 and DC2, change your replication factor to be DC2:0 on DC1 and DC1:0 on DC2.
On each DC, run 'nodetool removenode' for each node in the other DC. This will just remove the DOWN nodes from the ring but won't have any effect on the data, because the other DC's nodes have replication factor zero.
This should work with zero data loss.
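
The replication change in the procedure above can be sketched as follows: for each side of the split, generate an ALTER KEYSPACE statement that keeps the local DC's replication factor and sets the other DC's to zero. The keyspace name my_keyspace and the replication factors are assumptions for the example; DC1 and DC2 are the names used in the answer. Each statement would be run only on its own side, after the two DCs have been isolated.

```python
def isolate_statement(keyspace, replication, local_dc):
    """ALTER KEYSPACE keeping local_dc's RF and zeroing every other DC."""
    parts = []
    for dc, rf in replication.items():
        kept_rf = rf if dc == local_dc else 0
        parts.append(f"'{dc}': {kept_rf}")
    opts = ", ".join(parts)
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {opts}}};"
    )


# Hypothetical keyspace name and replication factors.
current = {"DC1": 3, "DC2": 3}
for dc in current:
    print(f"-- run on {dc} only")
    print(isolate_statement("my_keyspace", current, dc))
```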
