apache cassandra 3.9 - Enabling security - cassandra

We are trying to add a node to the existing ring, wherein security is enabled and the default cassandra user is made non-super. Also, we altered the keyspace to NetworkTopologyStrategy with replication = no. of nodes. The ring is currently on AWS.
Once the new node joins the cluster, the only user we see is the non-super cassandra user. We are pretty much locked out of the cluster. However, once we remove the newly joined node, all the security that we had before comes back.
Are there any best practices that we need to follow to enable security in 3.9?
Thanks in advance for helping me out on this!

Related

Does Scylla DB have a similar migration support to GKE as K8ssandra's Zero Downtime Migration feature?

We are trying to migrate our ScyllaDB cluster deployed on GCE machines to the GKE cluster in Google Cloud. We came across one approach for Cassandra migration and want to implement the same here for the ScyllaDB migration. Below is the link for the same; can you please suggest if this is possible in Scylla?
Or has Scylla not introduced such a migration technique with the Scylla K8S operator?
https://k8ssandra.io/blog/tutorials/cassandra-database-migration-to-kubernetes-zero-downtime/
Adding a new "destination" DC to your existing cluster's "source" DC is a very common technique to migrate to a new DC.
Add the new "destination" DC
Change replication factor settings accordingly
nodetool rebuild --> stream data from the "source" DC to the "destination" DC
nodetool repair the new DC.
Update your application clients to connect to the new DC once it's ready to serve (all data streamed + repaired)
Decommission the "old" (source) DC
For the gory details see here:
https://docs.scylladb.com/stable/operating-scylla/procedures/cluster-management/add-dc-to-existing-dc.html
https://docs.scylladb.com/stable/operating-scylla/procedures/cluster-management/decommissioning-data-center.html
If you prefer to go the full-scan route (CQL reads on the source and CQL writes on the destination, with some ability for data manipulation and save points to resume from), then the Scylla Spark Migrator is a good option.
https://github.com/scylladb/scylla-code-samples/tree/master/spark-scylla-migrator-demo
You can also use the Scylla Spark Migrator to migrate Parquet files.
https://www.scylladb.com/2020/06/10/migrate-parquet-files-with-the-scylla-migrator/
Remember not to migrate materialized views (MV); you can always re-create them post migration from the base tables.
We use an Apache Spark-based Migrator: https://github.com/scylladb/scylla-migrator
Here's the blog we wrote on how to do this back in 2019: https://www.scylladb.com/2019/02/07/moving-from-cassandra-to-scylla-via-apache-spark-scylla-migrator/
Though in this case, you aren't moving from Cassandra to ScyllaDB, just moving from one ScyllaDB instance to another. If this makes sense to you, it should be straightforward. If you have questions, feel free to join our Slack community to get more interactive assistance:
http://slack.scylladb.com/

CouchDB replication to cluster

I'm trying to set up a single CouchDB node with a primary copy of a database and have it replicate (one way) to a three-node CouchDB cluster. I want to do this for HA and performance; the users would talk to the read-only cluster.
This setup doesn't seem to work; no matter what I try, the replication always gets an authorization error. I'm 100% sure the password is correct. Indeed, I can't even seem to set up replication between one database and another within the cluster. All four nodes are running 2.3.0.
Is replication not compatible with clustering?
This does work, I found out what the issue was. I was specifying the clear-text admin password in the cluster configuration and each node was generating a hash for it with a different salt. The solution was to specify the same hashed value in the config file for all clustered nodes.
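An added example (not from the original post) of the salt problem described above and of producing one shared value for every node: CouchDB 2.x stores admin passwords as -pbkdf2-<hex hash>,<hex salt>,<iterations>, derived with PBKDF2-HMAC-SHA1 and a 20-byte key, salted with a random 32-character hex UUID; the exact format and the default iteration count of 10 are assumptions to check against your own node's local.ini.

```python
import hashlib
import hmac
import secrets
from typing import Optional

# Assumption: CouchDB 2.x [couch_httpd_auth] iterations default.
DEFAULT_ITERATIONS = 10


def couchdb_admin_hash(password: str, salt_hex: Optional[str] = None,
                       iterations: int = DEFAULT_ITERATIONS) -> str:
    """Return the string CouchDB writes into [admins] for a plain-text password.

    Each node that sees a clear-text password picks its own random salt, so two
    nodes hashing the same password produce different strings (the bug in the
    question). Pass salt_hex explicitly to get a reproducible value.
    """
    if salt_hex is None:
        salt_hex = secrets.token_hex(16)  # 32 hex chars, like CouchDB's uuid salt
    # CouchDB feeds the ASCII hex salt string (not its decoded bytes) to PBKDF2.
    derived = hashlib.pbkdf2_hmac("sha1", password.encode("utf-8"),
                                  salt_hex.encode("ascii"), iterations, dklen=20)
    return f"-pbkdf2-{derived.hex()},{salt_hex},{iterations}"


def couchdb_verify(password: str, stored: str) -> bool:
    """Check a password against a stored -pbkdf2- value (constant-time compare)."""
    if not stored.startswith("-pbkdf2-"):
        raise ValueError("not a CouchDB pbkdf2 hash")
    _derived_hex, salt_hex, iterations = stored[len("-pbkdf2-"):].split(",")
    candidate = couchdb_admin_hash(password, salt_hex, int(iterations))
    return hmac.compare_digest(candidate.encode(), stored.encode())


def admins_section(user: str, stored: str) -> str:
    """The [admins] fragment to copy verbatim into local.ini on every cluster node."""
    return f"[admins]\n{user} = {stored}\n"


if __name__ == "__main__":
    # Constructed sample: what two nodes would each generate from the same clear text.
    print("node1:", couchdb_admin_hash("s3cret"))
    print("node2:", couchdb_admin_hash("s3cret"))
    shared = couchdb_admin_hash("s3cret", "0123456789abcdef0123456789abcdef")
    print(admins_section("admin", shared))
```

Generate the value once, paste the same [admins] line into every node's local.ini, and the replicator's credentials will verify identically on all nodes.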

Creating new datacenter with Datastax OpsCenter

I'd like to enable vnodes on my Cassandra cluster, which has an Analytics DC and a regular Cassandra DC. I am using OpsCenter 5.0.1 and DSE 4.5. My question is: how can I create a new DC with OpsCenter, with vnodes enabled, so I can transfer my data over from my existing DCs? I am following the instructions on this page, but surely I don't have to manually edit the config file on every node to enable a new datacenter, right? Any help much appreciated.
Unfortunately OpsCenter's automated provisioning doesn't currently support creating multi-dc clusters or adding data centers to existing clusters. We know this is important functionality that's missing, and are working on making that available as soon as we can.

Enable Cassandra PasswordAuthenticator at up time

I have a Cassandra cluster (Datastax open source) and currently there is no authentication configured (i.e., it is using AllowAllAuthenticator), and I want to use PasswordAuthenticator. The official document says that I should follow these steps:
1. enable PasswordAuthenticator in cassandra.yaml,
2. restart the Cassandra node, which will create the system_auth keyspace,
3. change the system_auth replication factor,
4. create new user and password
However, this is a big problem for me because the cluster is used in production, so we cannot have any downtime. Between step 2 and 4 no user has been configured yet, so even if the client supplies a username and password, the request would still be rejected, which is not ideal.
I looked into the Datastax Enterprise doc, and it has a TransitionalAuthenticator class, which would create the system_auth keyspace but without rejecting requests. I wonder if this class can be ported to the open source version? Or if there are other ways around this problem? Thanks
Update
This is the Cassandra version I'm using:
cqlsh 4.1.1 | Cassandra 2.0.9 | CQL spec 3.1.1 | Thrift protocol 19.39.0
You should be able to execute steps 2-4 with just one node and have zero downtime, assuming proper client configuration, replication, and cluster capacity. Then, it's just a rolling restart of the remaining nodes.
Clients should be set up with credentials ahead of time, and they will start using them as nodes with authorizers come online (this behavior could depend on the driver -- try it out first).
You might be able to manually generate the schema and data for steps 3-4 before engaging the CassandraAuthenticator, but that shouldn't be necessary.
What are your concerns about downtime?

Ability to write to a particular cassandra node

Is there a possibility to write to a particular node using the DataStax driver?
For example, I have three nodes in datacenter 1 and three nodes in datacenter 2.
Existing
If I build up the cluster with any one of them as seed, all the nodes will get detected by the DataStax Java driver. So, in this case, if I insert data using the driver, it will automatically choose one of the nodes and proceed with it as the coordinator (preferably in the local data center).
Requirement
I want a way to contact any node in datacenter 2 and hand over the coordinator job to one of the nodes in datacenter 2.
Why i need this
I am trying to use the trigger functionality from datacenter 2 alone. Since triggers are taken care of by the coordinator, I want a coordinator to be selected from datacenter 2 so that datacenter 1 doesn't have to do this operation.
You may be able to use the DCAwareRoundRobinPolicy load balancing policy to achieve this by creating the policy such that DC2 is considered the "local" DC.
Cluster.Builder builder = Cluster.builder().withLoadBalancingPolicy(new DCAwareRoundRobinPolicy("dc2"));
In the above example, remote (non-DC2) nodes will be ignored.
There is also a new WhiteListPolicy in driver version 2.0.2 that wraps another load balancing policy and restricts the nodes to a specific list you provide.
Cluster.Builder builder = Cluster.builder().withLoadBalancingPolicy(new WhiteListPolicy(new DCAwareRoundRobinPolicy("dc2"), whiteList));
For multi-DC scenarios Cassandra provides EACH and LOCAL consistency levels where EACH will acknowledge successful operation in each DC and LOCAL only in local one.
If I understood correctly, what you are trying to achieve is DC failover in your application. This is not a good practice. Let's assume your application is hosted in DC1 alongside Cassandra. If DC1 goes down, your entire application is unavailable. If DC2 goes down, your application can still write with a LOCAL CL, and C* will replicate changes when DC2 is back.
If you want to achieve HA, you need to deploy the application in each DC, use CL=LOCAL_X, and finally do failover at the DNS level (e.g. using AWS Route53).
See data consistency docs and this blog post for more info about consistency levels for multiple DCs.