Is it possible to replicate only some keyspaces to another DC? - cassandra

I have 4 keyspaces and 1 data center, "DC1". Three keyspaces use SimpleStrategy and one uses NetworkTopologyStrategy. I want to add a new data center, "DC2". Replication will be set as DC1:3 and DC2:3.
Is it possible to replicate only one keyspace to the new DC? That is, 3 keyspaces replicate only to DC1 and 1 keyspace replicates to both DC1 and DC2.
Please confirm whether partial replication between DCs is possible.

Yes, you can replicate a keyspace to only the DCs you need. There is no requirement to replicate to all data centers, except for some system keyspaces.
But be careful: before adding the new DC, change all existing keyspaces to use NetworkTopologyStrategy. SimpleStrategy ignores data center boundaries, so if you keep using it, data from those keyspaces will go to the new DC as well. Refer to the DSE Admin guide for details on how to do it.
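As a minimal sketch of the advice above, the following Python snippet only renders the ALTER KEYSPACE statements you would run in cqlsh before adding DC2. The keyspace names (ks_a, ks_b, ks_c, ks_shared) and the helper alter_replication are hypothetical; the replication factors come from the question.

```python
def alter_replication(keyspace, dc_replicas):
    """Render an ALTER KEYSPACE statement using NetworkTopologyStrategy."""
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in dc_replicas.items())
    return (
        f"ALTER KEYSPACE {keyspace} WITH REPLICATION = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )

# Hypothetical keyspace names; only ks_shared gets replicas in DC2.
plan = {
    "ks_a": {"DC1": 3},
    "ks_b": {"DC1": 3},
    "ks_c": {"DC1": 3},
    "ks_shared": {"DC1": 3, "DC2": 3},
}

statements = [alter_replication(ks, dcs) for ks, dcs in plan.items()]
for stmt in statements:
    print(stmt)
```

The data center names in the statements must match the names reported by the snitch exactly, including case.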

Related

Cassandra Replication Factor

Let's say I have two data centers (DC1, DC2) in a single Cassandra cluster.
DC1 - 4 nodes.
DC2 - 4 nodes.
Initially I set the replication factor for all the keyspaces to {DC1:2, DC2:2} (NetworkTopologyStrategy).
But after some time, let's say I alter the keyspaces and change the replication factor to {DC2:2} for all of them (removing DC1), so there is no replication factor for DC1.
So now what will happen? Will DC1 get any data written into it in the future?
Will all the token ranges be assigned to only DC2?
If you exclude DC1, it won't receive writes for that keyspace, and data won't be read from DC1 either. Before switching off DC1, run nodetool repair on the servers in DC2 to make sure that all data is synchronized. After changing RF, you
When you change the RF for a specific keyspace, the drivers and Cassandra itself recalculate the token range assignments, taking the data center information into account.
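To make the recalculation concrete, here is a simplified model of per-DC replica placement, not Cassandra's actual implementation: it ignores racks and virtual nodes, and the ring, tokens, and node names are invented for illustration. It walks the ring clockwise from a key's token and picks nodes until each data center has its configured number of replicas.

```python
from bisect import bisect_left

# Hypothetical ring: (token, node, datacenter), sorted by token.
RING = [
    (0, "dc1-n1", "DC1"), (10, "dc2-n1", "DC2"),
    (20, "dc1-n2", "DC1"), (30, "dc2-n2", "DC2"),
    (40, "dc1-n3", "DC1"), (50, "dc2-n3", "DC2"),
    (60, "dc1-n4", "DC1"), (70, "dc2-n4", "DC2"),
]

def replicas_for(token, replication):
    """Walk the ring clockwise and pick nodes until each DC has its RF."""
    tokens = [t for t, _, _ in RING]
    start = bisect_left(tokens, token) % len(RING)
    chosen, per_dc = [], {dc: 0 for dc in replication}
    for i in range(len(RING)):
        _, node, dc = RING[(start + i) % len(RING)]
        # A DC missing from the replication map has an effective RF of 0.
        if per_dc.get(dc, 0) < replication.get(dc, 0):
            chosen.append(node)
            per_dc[dc] += 1
    return chosen

before = replicas_for(25, {"DC1": 2, "DC2": 2})
after = replicas_for(25, {"DC2": 2})
print(before)  # ['dc2-n2', 'dc1-n3', 'dc2-n3', 'dc1-n4']
print(after)   # ['dc2-n2', 'dc2-n3']
```

In this model, once DC1 is dropped from the replication map, no DC1 node is selected for any token of that keyspace, which matches the answer above.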

Cassandra multi-DC: write on LOCAL and read from any DC

We use a multi-data center (DC) Cassandra cluster. During writes, I want only the LOCAL DC to perform the write on its nodes, since we already route each write request to the desired DC based on where the write originates. So I want only the LOCAL DC to process the write, and no other DC to perform it on its nodes. Later, through replication among nodes across DCs, I want the written data to be replicated to the other DCs. Is this cross-DC replication possible when I restrict the write to only one DC in the first place? If I do not open connections to REMOTE hosts in other DCs during my write operation, can the data still be replicated among DCs later? I need replicas of the data in all DCs because reads should be served from whichever DC the read request lands on, not necessarily the LOCAL one.
Does anyone have a solution to this?
You may want to use LOCAL_QUORUM consistency for writes if you want them acknowledged only in the local DC.
Check the definition of the keyspace you want to restrict. It should use the class NetworkTopologyStrategy and have an RF in both DCs. Something like this:
ALTER KEYSPACE <Keyspace_name> WITH REPLICATION =
{'class' : 'NetworkTopologyStrategy', 'dc1' : 3, 'dc2' : 2};
With this definition, Cassandra still propagates each write to the other DC; the consistency level only determines how many replicas must acknowledge the write before it is reported as successful.
Use QUORUM consistency for reads if they are not restricted to one DC, but be aware that it might add a bit of latency because Cassandra may have to read data from the other data center as well.
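The quorum sizes behind these consistency levels follow from simple arithmetic: a quorum is a majority, floor(n / 2) + 1, taken over the local replicas for LOCAL_QUORUM and over all replicas for QUORUM. A small sketch using the replication factors from the ALTER KEYSPACE example above (the helper name quorum is just for illustration):

```python
def quorum(replicas):
    """Majority of a replica count: floor(n / 2) + 1."""
    return replicas // 2 + 1

replication = {"dc1": 3, "dc2": 2}  # values from the ALTER KEYSPACE example

# LOCAL_QUORUM counts only the replicas in the coordinator's data center.
local_quorum = {dc: quorum(rf) for dc, rf in replication.items()}
# QUORUM counts the replicas in all data centers together.
cluster_quorum = quorum(sum(replication.values()))

print(local_quorum)    # {'dc1': 2, 'dc2': 2}
print(cluster_quorum)  # 3
```

With these numbers, a QUORUM read needs 3 replicas, so a request arriving in dc2, which holds only 2 replicas, cannot be satisfied without contacting dc1.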

Replication factor in Cassandra

I am a newbie to Cassandra.
What exactly does replication factor mean in Cassandra?
For example,
I have a 3-node cluster (node1, node2, node3). If I create a keyspace with replication factor 1 and insert data through node1, can I read the data from the other 2 nodes?
Or will it store the data only on node1? Is the data available on the other 2 nodes for read/write operations?
The replication factor is the total number of replicas of each row across the cluster. A replication factor of 1 means that there is only one copy of each row, stored on one node. You can still read and write that data through the other two nodes, because any node can coordinate a request and forward it to the node that owns the row, provided the ports and firewalls between the nodes allow it.
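A toy model may help here. It is a sketch under simplifying assumptions, not Cassandra's code: the three-node ring and its tokens are invented, and a real cluster derives the token by hashing the partition key. It shows that with RF 1 a row lives on exactly one node, while any node can act as coordinator for the request.

```python
from bisect import bisect_left

# Hypothetical three-node ring: (token, node), sorted by token.
RING = [(0, "node1"), (100, "node2"), (200, "node3")]

def simple_replicas(token, rf):
    """Pick the next rf nodes clockwise from the token (SimpleStrategy-like)."""
    tokens = [t for t, _ in RING]
    start = bisect_left(tokens, token) % len(RING)
    return [RING[(start + i) % len(RING)][1] for i in range(min(rf, len(RING)))]

def read(coordinator, token, rf):
    """Any node may coordinate; it forwards the request to the replicas."""
    return {"coordinator": coordinator, "served_by": simple_replicas(token, rf)}

print(read("node1", 150, rf=1))  # {'coordinator': 'node1', 'served_by': ['node3']}
print(read("node2", 150, rf=1))  # {'coordinator': 'node2', 'served_by': ['node3']}
```

Whichever node the client connects to, the row with token 150 is served by node3 in this model; raising rf adds further nodes to served_by.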

Can Cassandra support multi-DC cluster with different number of nodes?

I want to be able to keep a backup/replica of operational data on a single node so we can run some ad hoc queries.
Having just one machine handle this replica will work for now.
Is this possible? If not, what are the arguments against it?
Yes, you can have a different number of nodes in each data center. Set the replication factor according to your requirements.
E.g. if you have DC1 with 4 nodes and are going to add DC2 with 1 node, then the replication factor for your keyspace should be DC1=x, DC2=1 (where x<=4).
To add one more data center, you need to check the topology, snitch, and seed configurations.
E.g. if you are using SimpleSnitch, you can't have multiple data centers, so you need to change your snitch and topology. Check this link, which explains more about changing snitch and topology.
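The x<=4 rule of thumb above can be expressed as a small sanity check. This is an illustrative helper (check_replication is not part of any Cassandra tool) that compares a planned replication map with the node count per data center:

```python
def check_replication(nodes_per_dc, replication):
    """Return a list of problems; an empty list means the settings fit."""
    problems = []
    for dc, rf in replication.items():
        nodes = nodes_per_dc.get(dc, 0)
        if nodes == 0:
            problems.append(f"{dc}: no nodes in this data center")
        elif rf > nodes:
            problems.append(f"{dc}: RF {rf} exceeds {nodes} node(s)")
    return problems

cluster = {"DC1": 4, "DC2": 1}  # node counts from the example above
ok = check_replication(cluster, {"DC1": 3, "DC2": 1})
too_many = check_replication(cluster, {"DC1": 3, "DC2": 2})
print(ok)        # []
print(too_many)  # ['DC2: RF 2 exceeds 1 node(s)']
```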

Decommissioning One of Two Datacenters

I have one Cassandra datacenter. Let's name it DC1. Then I added a new datacenter to extend the node count. Let's name it DC2. I use replication factors DC1:3 and DC2:3. I write all my data with LocalDC=DC2 and ConsistencyLevel.LocalQuorum. I am sure that all write requests go to DC2. I want to remove DC1 and I don't want to run the nodetool repair command. I don't want to wait.
Can I simply change the replication factor of all keyspaces to DC2:3 and run nodetool decommission on the DC1 nodes?
Yes.
Since you say you are sure that there is no data lag between the two data centers, you can skip the repair step.
Change the replication settings of all keyspaces using ALTER KEYSPACE.
Decommission all the DC1 nodes one by one.
See this: https://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_decomission_dc_t.html
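The two steps above can be written out as a checklist. The sketch below only builds the list of commands as text and executes nothing; the keyspace names and host names are hypothetical, and decommission_plan is an illustrative helper.

```python
def decommission_plan(keyspaces, remaining, nodes_to_remove):
    """Build an ordered checklist: alter every keyspace, then remove nodes."""
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in remaining.items())
    steps = [
        f"cqlsh: ALTER KEYSPACE {ks} WITH REPLICATION = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
        for ks in keyspaces
    ]
    # Decommission the old data center's nodes one at a time, in order.
    steps += [f"on {node}: nodetool decommission" for node in nodes_to_remove]
    return steps

steps = decommission_plan(
    keyspaces=["ks_orders", "ks_users"],            # hypothetical names
    remaining={"DC2": 3},
    nodes_to_remove=["dc1-host1", "dc1-host2"],     # hypothetical hosts
)
for step in steps:
    print(step)
```

Keeping the ALTER KEYSPACE statements ahead of the decommission commands reflects the order given in the answer: DC1 should no longer own replicas before its nodes leave the ring.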
