I am provisioning a new datacenter for an existing cluster. A rather shaky VPN connection is preventing me from bootstrapping the new DC with nodetool rebuild. Interestingly, I have a full, fresh database snapshot/backup at the same location as the new DC (transferred outside of the VPN). I am now considering the following approach:
1. Make sure my clients are using the old DC.
2. Provision the new nodes in the new DC.
3. ALTER the keyspace to add replicas in the new DC. This will start replicating all writes from the old DC to the new DC.
4. Within gc_grace_seconds of step 3 above, use sstableloader to stream my backup to the new nodes (a rough sketch follows this list).
5. As a safety precaution, do a full repair.
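Roughly, steps 3 and 4 would be something like the following (a sketch only; the keyspace, DC names, path and address are placeholders):

# step 3: add replicas in the new DC (run once, from any node)
cqlsh -e "ALTER KEYSPACE my_keyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'OLD_DC': 3, 'NEW_DC': 3};"
# step 4: stream the backup into the new DC, one keyspace/table directory at a time
sstableloader -d <new_dc_node_ip> /path/to/backup/my_keyspace/my_table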
Would this work?
Our team also faced a similar situation. We run C* on Amazon EC2.
So we first prepared a snapshot of the existing nodes and used it to build the nodes for the other datacenter (to avoid a huge data transfer).
Procedure we followed:
Change the replication strategy of the keyspaces on DC1 from SimpleStrategy to NetworkTopologyStrategy {DC1: x, DC2: y}
change cassandra.yaml
endpoint_snitch: GossipingPropertyFileSnitch
add a DC2 node IP to seeds list
nothing else needs to change
change cassandra-rackdc.properties
dc=DC1
rack=RAC1
restart nodes one at a time.
restart seed node first
Alter the keyspace.
ALTER KEYSPACE keyspace_name WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'DC1' : x, 'DC2':y };
Do this for every keyspace in DC1
no need to repair.
verify that the system is stable by running a few queries
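For that verification, something like the following could be run from a DC1 node (keyspace and table names are placeholders):

# all nodes should show UN and the keyspace should report the new topology
nodetool status
cqlsh -e "DESCRIBE KEYSPACE keyspace_name;"
# a read at LOCAL_QUORUM confirms DC1 can serve queries on its own
cqlsh -e "CONSISTENCY LOCAL_QUORUM; SELECT * FROM keyspace_name.some_table LIMIT 1;"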
Add the DC2 servers as a new data center to the existing (DC1) cluster
https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html
on the DC2 nodes, set auto_bootstrap: false in cassandra.yaml
fix seeds, endpoint_snitch, and cluster name
use the IP of a DC1 node and the IP of a DC2 node as seeds
recommended endpoint_snitch: GossipingPropertyFileSnitch
cluster_name: same as DC1 (test-cluster)
fix the GossipingPropertyFileSnitch config in cassandra-rackdc.properties
dc=DC2
rack=RAC1
bring DC2 nodes up one at a time
seed node first
change the keyspace to NetworkTopologyStrategy {DC1: x, DC2: y}
since the DC2 db is copied from DC1, we should repair instead of rebuild
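A rough sketch of that last step, run on each DC2 node one at a time (the keyspace name is a placeholder):

# repair brings the copied data up to date without re-streaming everything
nodetool repair my_keyspace
# (add -full on versions where incremental repair is the default; a rebuild,
#  i.e. "nodetool rebuild -- DC1", is only needed when the DC2 nodes start out empty)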
Yes, the approach should work. I've verified it with two knowledgeable people within the Cassandra community. Two pieces that are important to note, however:
The snapshot must be taken after the mutations have started being written to the new datacenter.
The backup must be fully imported within gc_grace_seconds of when the backup was taken. Otherwise you risk zombie data popping up.
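gc_grace_seconds is set per table, so it is worth checking the smallest value in the keyspace before planning the import window. On Cassandra 3.0+ (keyspace name is a placeholder):

# default gc_grace_seconds is 864000, i.e. 10 days
cqlsh -e "SELECT table_name, gc_grace_seconds FROM system_schema.tables WHERE keyspace_name = 'my_keyspace';"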
I have a single-node Cassandra cluster which has around 44 GB of data on it (/var/lib/cassandra/data/my_keyspace). The current storage is 1 TB and I need to migrate all the data to another VM which will have the same setup (single-node cluster). My data node has data being pushed to it every second, so I can't afford any downtime (some sensors are pushing time-series data).
Keyspace :- CREATE KEYSPACE my_keysopace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 127.0.0.1 43.4 GiB 256 100.0% e0ae36db-f639-430c-91ad-6af3ffb6f906 rack1
After a bit of research I decided it's best to add the new node to existing cluster and then let the old node stream all the data and after streaming is done, decommission the old node.
Source :- https://docs.datastax.com/en/archived/cassandra/2.0/cassandra/operations/ops_add_node_to_cluster_t.html
Configure old node as seed node for the new node
Add the new node to the ring (auto_bootstrap = true)
Once the status is UN for both nodes, run nodetool cleanup on old node
Decommission the old node
My only concern is: will I be facing any data loss, and is this approach appropriate?
Please let me know if I am missing anything here
Thanks
Firstly, a disclaimer: using a single node of C* defeats the purpose of a distributed database. The minimal cluster size tends to be 3, so that some nodes can go offline without downtime (I'm sure you've seen this warning before). Now with that out of the way, let's discuss the process.
Configure old node as seed node for the new node
Yep.
1.5. (Potentially missing step) The step you're missing is verifying the consistency level of your queries. I see you're using replication_factor 1 for all keyspaces in use, so make sure your queries use a consistency level of ONE.
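For ad-hoc checks the level can be forced explicitly, e.g. (the table name is a placeholder; the keyspace is the one from the question):

# with replication_factor 1, anything above ONE fails as soon as the single replica is unavailable
cqlsh -e "CONSISTENCY ONE; SELECT * FROM my_keyspace.sensor_data LIMIT 1;"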
Add the new node to the ring (auto_bootstrap = true)
Sounds good. Make sure you've configured the various ports, listen_address, etc.
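The settings worth double-checking on the new node are roughly these (the config path depends on the install):

grep -E "cluster_name|listen_address|rpc_address|seeds" /etc/cassandra/cassandra.yaml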
Once the status is UN for both nodes,
Once you reach UN double-check that the client isn't seeing any consistency errors.
3.5. run nodetool cleanup on old node
3.5. (Redundant step) You don't need to run nodetool cleanup. Cleanup only removes data a node no longer owns, and you won't care about leftover data on the old node since it is about to be decommissioned anyway; all of its data will have been moved to the new node replacing it.
Decommission the old node
Yep.
(Missing step) You'll have to modify the new node to list itself as a seed once you've decommissioned the old node, or it won't be able to restart.
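A sketch of those last two steps (addresses are placeholders):

# on the old node, once the new node is UN and streaming has finished:
nodetool decommission
# on the new node, make it its own seed so it can start up alone afterwards:
# edit cassandra.yaml, e.g.  - seeds: "<new-node-address>"  and restart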
Can I prevent a keyspace from syncing over to another datacenter by NOT including the other datacenter in my keyspace replication definition?
Apparently, this is not the case.
In my own test, I have set up two Kubernetes clusters in GCP, each serving as a Cassandra datacenter. Each k8s cluster has 3 nodes.
I set up datacenter DC-WEST first, and create a keyspace demo using this:
CREATE KEYSPACE demo WITH replication = {'class': 'NetworkTopologyStrategy', 'DC-WEST': 3};
Then I set up datacenter DC-EAST, without adding any user keyspaces.
To join the two data centers, I modify the CASSANDRA_SEEDS environment variable in the Cassandra StatefulSet YAML to include seed nodes from both datacenters (I use host networking).
But after that, I notice the keyspace demo is synced over to DC-EAST, even though the keyspace only has DC-WEST in the replication.
cqlsh> select data_center from system.local
... ;
data_center
-------------
DC-EAST <-- Note: this is from the DC-EAST datacenter
(1 rows)
cqlsh> desc keyspace demo
CREATE KEYSPACE demo WITH replication = {'class': 'NetworkTopologyStrategy', 'DC-WEST': '3'} AND durable_writes = true;
So we see in DC-EAST the demo keyspace which should be replicated only on DC-WEST! What am I doing wrong?
Cassandra replication strategies control where data is placed, but the actual schema (the existence of the table/datacenters/etc) is global.
If you create a keyspace that only lives in one DC, all other DCs will still see the keyspace in their schema, and will even make the directory structure on disk, though no data will be replicated to those hosts.
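One way to confirm that DC-EAST holds only the schema and no data (assuming the default data directory):

# effective ownership of the demo keyspace should be 0% for every DC-EAST node
nodetool status demo
# and the keyspace's table directories should be essentially empty on disk
du -sh /var/lib/cassandra/data/demo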
You didn't specify how you deployed your Cassandra cluster in Kubernetes, but it looks like your nodes in DC-WEST may be configured to say that they are in DC-EAST.
I would check the ConfigMap for the StatefulSet in DC-WEST; maybe it has the DC-EAST value in cassandra-rackdc.properties.
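For example, assuming the pod name and config path below (both are guesses for your deployment):

# the data center a pod reports is whatever cassandra-rackdc.properties says inside it
kubectl exec cassandra-west-0 -- cat /etc/cassandra/cassandra-rackdc.properties
kubectl exec cassandra-west-0 -- nodetool info | grep -i "data center"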
I have two docker cassandra container nodes acting as node1 and node2 in the same data center.
My aim is to have my Java application always connect to node1, and my ad-hoc manual queries should be served from node2 only (there should not be any inter-node communication for data).
Normally I can execute read/write queries against container1 or container2 using cqlsh. If I fire some queries at container1 using cqlsh, will it always return the data from the same container (node1), or may it also route to the other node internally?
I also know the coordinator node will talk with a peer node for the data request. What will happen in the case of RF=2 in a 2-node cluster: will the coordinator node itself be able to serve the data?
Here, RF=2, nodes=2, consistency=ONE.
I have set up clusters before to separate OLTP from OLAP. The way to do it, is to separate your nodes into different logical data centers.
So node1 should have its local data center set to "dc1" in cassandra-rackdc.properties:
dc=dc1
rack=r1
Likewise, node2 should be put into its own data center, "dc2":
dc=dc2
rack=ra
Then your keyspace definition will look something like this:
CREATE KEYSPACE stackoverflow
WITH REPLICATION={'class':'NetworkTopologyStrategy','dc1':'1','dc2':'1'};
My aim is to have my Java application always connect to node1
In your Java code, you should specify "dc1" as your default data center, as I do in this example:
// Sketch using the DataStax Java driver 3.x (classes from com.datastax.driver.core);
// the contact-point addresses below are placeholders.
String[] nodes = { "node1-address", "node2-address" };
PoolingOptions options = new PoolingOptions();
String dataCenter = "dc1";
Cluster.Builder builder = Cluster.builder()
    .addContactPoints(nodes)
    .withQueryOptions(new QueryOptions().setConsistencyLevel(ConsistencyLevel.LOCAL_ONE))
    // token-aware routing on top of a DC-aware policy pinned to the local DC
    .withLoadBalancingPolicy(new TokenAwarePolicy(
        new DCAwareRoundRobinPolicy.Builder()
            .withLocalDc(dataCenter).build()))
    .withPoolingOptions(options);
Cluster cluster = builder.build();
That will make your Java app "sticky" to all nodes in data center "dc1," or just node1 in this case. Likewise, when you cqlsh into node2, your ad-hoc queries should be "sticky" to all nodes in "dc2."
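For the ad-hoc side, connecting cqlsh directly to node2 and using a DC-local consistency level keeps the reads in dc2 (the address and table name are placeholders):

# the node cqlsh connects to acts as the coordinator; LOCAL_ONE keeps the read inside its own DC
cqlsh <node2-address> -e "CONSISTENCY LOCAL_ONE; SELECT * FROM stackoverflow.some_table LIMIT 10;"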
Note: In this configuration, you do not have high-availability. If node1 goes down, your app will not jump over to node2.
I have a 5-node Cassandra cluster with RF=3 (only for the application-related keyspaces) and only 1 data centre. I wish to change the password of the default cassandra user.
My system_auth keyspace has the following setting:
CREATE KEYSPACE system_auth WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;
Questions
Should the strategy be changed to NetworkTopologyStrategy? I thought it wasn't required as there is only 1 DC.
Should the RF be 3, the same as for the other application-related keyspaces?
When I change the credentials of the default cassandra user using the ALTER USER command, should I change it on each of the hosts, since currently RF=1?
Should the strategy be changed to NetworkTopologyStrategy? I thought it wasn't required as there is only 1 DC.
Since it's a single data center, SimpleStrategy should work fine. Consider changing to NetworkTopologyStrategy when going multi-DC.
Should the RF be 3, the same as for the other application-related keyspaces?
It's definitely recommended for the system_auth keyspace RF to be more than 1. With RF=1 there is only one copy of the stored user credentials, so the loss of any particular node would mean losing a portion of the authorization data. Increase it to a minimum of 3.
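A sketch of that change, assuming the default superuser can still log in (the password is a placeholder):

# raise the RF of system_auth (still SimpleStrategy, since it is a single DC), then repair it on every node
cqlsh -u cassandra -p '<password>' -e "ALTER KEYSPACE system_auth WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};"
nodetool repair system_auth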
When I change the credentials of the default cassandra user using the ALTER USER command, should I change it on each of the hosts, since currently RF=1?
No, it's not required to change it on each node. With RF=1, the credentials of the "cassandra" user live on only one node. Irrespective of which node you pick to run the password change, it will act as a coordinator and route the change to the node storing the cassandra user. Again, if you lose the node that stores the cassandra user, you potentially lose access to the cluster, so having RF=3 will avoid that.
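So the change itself is a single statement from any node; a sketch (the current and new passwords are placeholders):

cqlsh -u cassandra -p '<current-password>' -e "ALTER USER cassandra WITH PASSWORD '<new-password>';"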
As part of a POC, I have a 1-node Cassandra cluster with system_auth keyspace RF=1.
I added a second node to this cluster (with empty data/commitlog/saved_caches directories) and I notice the user credentials are replicated to the new node. Since RF=1 on the existing node, I don't expect them to be replicated to the new node.
Any reason why ?
Cassandra Version : 2.1.8
For most system_auth queries, Cassandra uses a consistency level of LOCAL_ONE, and it uses QUORUM for the default cassandra superuser. If both nodes are up, you will be able to see the data and log in without any problem. Also, you added the second node with an empty commit log and saved caches, but if you copied the rest of the data from the original node, the data will be there, including system_auth.
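One way to see what actually lives where (on 2.1 the credentials sit in system_auth.users and system_auth.credentials; the address and password are placeholders):

# run against each node in turn; with RF=1 only one node owns the row,
# but either node can coordinate the read while both are up
cqlsh <node-address> -u cassandra -p '<password>' -e "SELECT * FROM system_auth.users;"
# shows which node actually stores the default superuser's partition
nodetool getendpoints system_auth users cassandra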