YugabyteDB xCluster replication issues with WAL

Create two new fresh clusters, NYC and FRA (1 node per region)
Create a new table in each cluster (NYC and FRA)
Create bi-directional xCluster replication for the table (a sketch of the yb-admin calls follows these steps)
Everything works
Delete the replication in both directions (1 and 2)
Alter the table (add a column) in each cluster (NYC and FRA)
Re-create the replication in both directions (1 and 2). One direction works; the opposite direction fails with the error "unable to change the WAL retention time for table"
Recreate the full cluster
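For reference, a rough sketch of the per-direction setup and teardown calls (this assumes the standard yb-admin xCluster commands; the master addresses, replication group ID and table IDs below are placeholders):

# NYC -> FRA direction, run against the FRA (consumer) masters
yb-admin -master_addresses <fra_master_addresses> \
    setup_universe_replication <replication_group_id_1> \
    <nyc_master_addresses> <comma_separated_table_ids>
# Tear the same direction down again before the ALTER TABLE
yb-admin -master_addresses <fra_master_addresses> \
    delete_universe_replication <replication_group_id_1>
# The FRA -> NYC direction is the mirror image, run against the NYC masters.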

Related

How does allocate_tokens_for_local_replication_factor actually work?

I'm trying to add a new node to an existing cluster. allocate_tokens_for_local_replication_factor is set to 2 on the current nodes, but the default value in a newly installed Cassandra is 3. I tried to find information about this setting, but I can't find a clear description of how it works.
The allocate_tokens_for_local_replication_factor setting works in much the same way as allocate_tokens_for_keyspace: it triggers an algorithm that attempts to choose tokens such that the load (data density) is balanced or optimised across the nodes in the local data centre.
The main difference is that allocate_tokens_for_local_replication_factor optimises the algorithm for a defined replication factor of 3 (default) instead of the replication factor for a given keyspace (allocate_tokens_for_keyspace).
allocate_tokens_for_keyspace was added in Cassandra 3.0 (CASSANDRA-7032) to improve token allocation for clusters configured with virtual nodes. However, it suffers from the problem where the replication factor of a keyspace cannot be used when adding a new DC since the keyspace is not replicated to the new DC yet.
Cassandra 4.0 solved this problem by allowing operators to specify the replication factor to be used for optimising the token allocation algorithm with allocate_tokens_for_local_replication_factor (CASSANDRA-15260).
In your case where the existing nodes have:
allocate_tokens_for_local_replication_factor: 2
a previous operator would have configured it that way because the application keyspace(s) had a replication factor of 2.
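To keep token allocation consistent when the new node joins, a minimal cassandra.yaml sketch for that node could look like this (assuming the application keyspaces really do use a replication factor of 2):

# cassandra.yaml on the new node (sketch)
num_tokens: 16    # should match the value used on the existing nodes (16 is the 4.0 default)
allocate_tokens_for_local_replication_factor: 2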
As a side note, thanks for bringing this to our attention. I have logged CASSANDRA-17984 so we could improve the docs. Cheers!

Cassandra sometimes skips records in SELECT query

My setup is:
cassandra 1.2.19
single datacenter cluster with 4 nodes
NetworkTopologyStrategy with replication factor of 3
consistency level of writes to the db is set to LOCAL_QUORUM
I am trying to iterate over all records in a given table, and I do so with some legacy application code which fetches the data in batches with consecutive SELECT queries of this type:
SELECT * FROM records WHERE TOKEN(partition_key) > last_partition_key_of_previous_batch LIMIT 1000;
The problem is that sometimes some records are skipped. I also noticed that those skipped records are old, added months ago to the database.
All of the select queries are executed with consistency level ONE.
Is it possible that this is the cause?
From what I understand about consistency levels, when the consistency level for reads is ONE, only one replica node is asked to execute the query.
Is it possible that sometimes the node that executes the query does not contain all the records and that's why sometimes some records are missing?
Changing the consistency level of the query to QUORUM fixed the issue. With a replication factor of 3, a QUORUM read involves at least two replicas, so it always overlaps with the LOCAL_QUORUM write set; a read at ONE can be served by the single replica that happened to miss a given write, which is why some records were occasionally skipped.
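For example, in cqlsh you could raise the read consistency before paging through the table like this (a sketch; the placeholder stands for the token value taken from the last row of the previous batch):

CONSISTENCY QUORUM;
SELECT * FROM records
WHERE TOKEN(partition_key) > last_partition_key_of_previous_batch
LIMIT 1000;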

Alter Keyspace on cassandra 3.11 production cluster to switch to NetworkTopologyStrategy

I have a Cassandra 3.11 production cluster with 15 nodes. Each node has ~500GB total, with replication factor 3. Unfortunately the cluster is set up with the 'SimpleStrategy' replication strategy. I am switching it to 'NetworkTopologyStrategy'. I am looking to understand the caveats of doing so on a production cluster. What should I expect?
Switching from SimpleStrategy to NetworkTopologyStrategy in a single data center configuration is very simple. The only caveat I would warn about is to make sure you spell the data center name correctly. Failure to do so will cause operations to fail.
One way to ensure that you use the right data center, is to query it from system.local.
cassdba@cqlsh> SELECT data_center FROM system.local;
data_center
-------------
west_dc
(1 rows)
Then adjust your keyspace to replicate to that DC:
ALTER KEYSPACE stackoverflow WITH replication = {'class': 'NetworkTopologyStrategy',
'west_dc': '3'};
Now for multiple data centers, you'll want to make sure that you specify your new data center names correctly, AND that you run a repair (on all nodes) when you're done. This is because SimpleStrategy treats all nodes as a single data center, regardless of their actual DC definition. So you could have 2 replicas in one DC, and only 1 in another.
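A two-DC version of the statement might look like this (sketch; 'east_dc' is a made-up second data center name), followed by a full repair on every node:

ALTER KEYSPACE stackoverflow WITH replication = {'class': 'NetworkTopologyStrategy',
    'west_dc': '3', 'east_dc': '3'};
-- then, on every node:
-- nodetool repair -full stackoverflow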
I have changed RFs for keyspaces on-the-fly several times. Usually, there are no issues. But it's a good idea to run nodetool describecluster when you're done, just to make sure all nodes have schema agreement.
Pro-tip: For future googlers, there is NO BENEFIT to creating keyspaces using SimpleStrategy. All it does is put you in a position where you have to fix it later. In fact, I would argue that SimpleStrategy should NEVER BE USED.
So when will the data movement commence? In my case, since I have specific rack IDs now, I expect my replicas to switch nodes upon this ALTER KEYSPACE action.
This alone will not cause any adjustment of token range responsibility. If you already have an RF of 3 and so does your new DC definition, you won't need to run a repair, so nothing will stream.
I have a 15-node cluster which is divided into 5 racks, so each rack has 3 nodes belonging to it. Since I previously had replication factor 3 and SimpleStrategy, more than one replica could have belonged to the same rack, whereas NetworkTopologyStrategy guarantees that no two replicas will belong to the same rack. So shouldn't this cause data to move?
In that case, if you run a repair your secondary or tertiary replicas may find a new home. But your primaries will stay the same.
So are you saying that nothing changes until I run a repair?
Correct.

Is there a way to mirror a Cassandra table across different keyspaces

Requirement:
We have a particular transaction table retail_mapping in the Cassandra keyspace "account".
We have another keyspace "dp" where the exact same table data (retail_mapping) needs to be replicated and accessed by micro-services.
1) Is there any way we can create a mirror table retail_mapping in the dp keyspace, fed from the account keyspace?
2) Any data which is persisted in the account keyspace also needs to be copied into the dp keyspace immediately.
For your first question, you could create a snapshot (to be done on each node) and copy the data files to the other table's directory, followed by a nodetool refresh.
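A rough sketch of that procedure, to be run on every node (the data directory layout and the <uuid> suffixes vary per installation, and dp.retail_mapping must already exist with the same schema):

# 1. Snapshot the source table
nodetool snapshot -t mirror -cf retail_mapping account
# 2. Copy the snapshotted SSTables into the target table's data directory
cp /var/lib/cassandra/data/account/retail_mapping-<uuid>/snapshots/mirror/* \
   /var/lib/cassandra/data/dp/retail_mapping-<uuid>/
# 3. Make the target table pick up the copied files
nodetool refresh dp retail_mapping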
For your second question, the best approach is to achieve that at the application layer. If that's not an option, then your best option is to look at Cassandra triggers.
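If you go the trigger route, the CQL side is only a registration; the class named here is a hypothetical Java ITrigger implementation that you would have to write and deploy to the triggers directory on every node:

CREATE TRIGGER mirror_to_dp ON account.retail_mapping
    USING 'com.example.MirrorToDpTrigger';  -- hypothetical trigger class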

Data Replication In Cassandra

I am trying to understand data replication in Cassandra. In my case, I have to store a huge number of records in a single table, partitioned by a yymmddhh primary key.
I have two data centers (DC1 and DC2) and I created a keyspace using below CQL.
CREATE KEYSPACE db1 WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'DC1' : 1, 'DC2' : 1 };
And then created a new table tbl_data using below CQL
CREATE TABLE db1.tbl_data (
yymmddhh varchar,
other_details text,
PRIMARY KEY (yymmddhh)
) WITH read_repair_chance = 0.0;
Now, I can see that the above keyspace "db1" and table "tbl_data" were created successfully. I have a few million rows to insert, and I am assuming that all rows will be stored on both servers, i.e. DC1 and DC2, since the replication factor is 1 for both data centers.
Suppose that after some time I need to add more nodes, since the number of records can increase to billions; in that case one data center can't handle that many records due to disk space limitations.
a) So, how can I divide data into different nodes and can add new nodes on demand?
b) Do I need to alter keyspace "db1" to put name of new data centers in the list?
c) How will the current system scale horizontally?
d) I am connecting to Cassandra with the Node.js driver using the code below. Do I need to put the IP addresses of all nodes in the code? If I keep increasing the number of nodes on demand, do I need to change the code every time?
var client = new cassandra.Client({ contactPoints: ['ipaddress_of_node1'], keyspace: 'db1' });
From all of the above you can see that my basic requirement is to store a huge number of records in a single table, spreading the data across different servers, and to be able to add new servers if the data volume increases.
a) If you add new nodes to the data center, the data will be automatically shared between the nodes. With replication factor 1 and default settings, it should be ~50% on each node (with two nodes per data center), though it might take a bit to redistribute data between the nodes after adding a new node. Running nodetool status with the keyspace name as an argument can show you which node owns how much of that keyspace.
b) Yes, I do believe you have to (though not 100% on this).
c) Horizontally, with your setup it'll scale linearly (assuming the machines are equal and have the same num_tokens value) by distributing data according to 1 divided by the number of nodes (1 node = 100%, 2 = 50%, 3 = 33%, etc.); both throughput and storage capacity will scale.
d) No, assuming the nodejs driver works like the C++ and Python drivers of Cassandra (it should!), after connecting to Cassandra it'll be aware of the other nodes in the cluster.
The answer by dbrats covers most of your concerns.
Do I need to alter keyspace "db1" to put name of new data centers in the list?
Not needed. You only want to alter it if you add a new data center or change the replication factor.
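For example, if a third data center were added later, the alteration would be a sketch like this ('DC3' is hypothetical), after which the new DC's nodes would need to be populated with the existing data (e.g. with nodetool rebuild):

ALTER KEYSPACE db1 WITH REPLICATION =
    { 'class' : 'NetworkTopologyStrategy', 'DC1' : 1, 'DC2' : 1, 'DC3' : 1 };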
Do I need to put ip address of all nodes here in code?
Not needed. But adding more than one contact point ensures higher availability.
If one contact point is down, the driver can connect to another. Once it connects, it discovers the full list of nodes in the cluster.
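A minimal sketch of the same client with more than one contact point (the IP addresses are placeholders; recent versions of the Node.js cassandra-driver also require localDataCenter):

const cassandra = require('cassandra-driver');
// Any single reachable contact point lets the driver discover the rest of the cluster
const client = new cassandra.Client({
    contactPoints: ['ipaddress_of_node1', 'ipaddress_of_node2', 'ipaddress_of_node3'],
    localDataCenter: 'DC1',   // required by driver v4+; the DC this application runs against
    keyspace: 'db1'
});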
