Yugabyte replication and clustering per database or keyspace - yugabytedb

Is it possible to configure the replication factor and clustering for each database (or keyspace), or must we configure it for the whole YugabyteDB instance?

At the moment, replication factor is per cluster/universe, and cannot be specified at the database/keyspace level. Could you please file a GitHub issue for this enhancement? It is something that was planned but we did not get around to it.
Not sure what you mean by clustering... all databases/keyspaces and their replicas are allowed to use all nodes in the cluster. Could you please explain what you're trying to achieve?
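For what it's worth (this is my illustration, not part of the original answer): YCQL will still parse a Cassandra-style keyspace definition for compatibility, but as far as I know the per-keyspace replication properties are not what drives replica placement in YugabyteDB - that comes from the cluster/universe-level replication factor.

-- Hedged sketch: YCQL accepts this syntax for Cassandra compatibility,
-- but (to my knowledge) the replication properties here do not override
-- the cluster/universe-level replication factor.
CREATE KEYSPACE example_ks
  WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };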

Related

Cassandra 3.11.9: should system_auth be SimpleStrategy or NetworkTopologyStrategy in a production environment?

What is recommended for Apache Cassandra 3.11.9's system_auth keyspace? Should it be SimpleStrategy or NetworkTopologyStrategy? And with what RF?
We have Cassandra with 1 DC (2-3 AWS racks with Ec2Snitch, and dynamic_snitch disabled). Most queries run at consistency level LOCAL_ONE. Today our system_auth keyspace is configured with SimpleStrategy and RF 3. In a lot of queries, we are wasting time on (tracing):
Executing single-partition query on roles [ReadStage-X]
As part of an attempt to solve our problems, we also increased these parameters:
roles_validity_in_ms, permissions_validity_in_ms, credentials_validity_in_ms, permissions_cache_max_entries.
Can query latency problems be connected to the system_auth keyspace configuration?
I answered this question a while ago, which is similar:
Replication Factor to use for system_auth
Due to issues that can happen with larger clusters which fluctuate in size, we now treat system_auth like we do any other keyspace. That is, we set system_auth's RF to 3 in each DC.
tl;dr: if you're using NetworkTopologyStrategy on your non-system keyspaces, then you should also be using it for system_auth. Same with your RF; I'd always match the RF of system_auth with that of my "normal" keyspaces as well.
No, the replication strategy and RF used on system_auth do not typically cause query latency. That is, of course, unless any of the security cache settings have been altered. In 10 years of working with Cassandra, I've never had to change those: https://docs.datastax.com/en/security/5.1/security/secAuthCacheSettings.html
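To make the strategy/RF advice concrete, the change would look roughly like this (the DC name 'dc1' is a placeholder of mine; use the names shown by nodetool status), followed by a repair of the keyspace:

-- Hedged sketch: match system_auth's strategy and RF to your other keyspaces.
-- 'dc1' is an assumed datacenter name.
ALTER KEYSPACE system_auth
  WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'dc1' : 3 };
-- Then re-replicate the existing auth data:
--   nodetool repair -full system_auth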
queries wasting time on (tracing): "Executing single-partition query on roles [ReadStage-X]"
This statement got me thinking: are you tracing queries in cqlsh while logged in as the default cassandra user? That user does trigger some cqlsh operations to execute at QUORUM. It could also be that the query consistency and connection consistency are set differently. Just a thought.

Repair system_auth keyspace in Cassandra

According to the official documentation, the system keyspace uses the local replication strategy, so there is no need to repair it. My question is about the system_auth keyspace: should I manually run repair on it?
When I use full repair without specifying any keyspace, I expect to see system_auth being repaired in the log file, but I can't see any indication that system_auth is getting repaired.
Only some system keyspaces use the local replication strategy. The system_auth keyspace uses SimpleStrategy with a replication factor of 1 by default (see docs). If you have a cluster of several nodes, it's recommended to set the replication strategy to NetworkTopologyStrategy (even if you have one DC - it will help in the future) and increase the replication factor to 3 in each DC. Then you need to run repairs on it to keep it in a consistent state.
P.S. Also, create a new superuser (see step 5 in docs), because the default cassandra user reads login data at QUORUM, which could be a problem if you lose half of your machines.
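To illustrate both points (the DC name and credentials below are placeholders, not from the original answer):

-- Hedged sketch: switch system_auth to NetworkTopologyStrategy with RF 3,
-- then repair the keyspace so the auth data is consistent across replicas.
ALTER KEYSPACE system_auth
  WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'dc1' : 3 };
--   nodetool repair -full system_auth

-- Create a replacement superuser so you can stop relying on the default
-- cassandra account (which reads its login data at QUORUM):
CREATE ROLE admin_user WITH SUPERUSER = true AND LOGIN = true AND PASSWORD = 'choose-a-strong-password';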

CouchDB replication to cluster

I'm trying to set up a single CouchDB node with a primary copy of a database and have it replicate (one way) to a three-node CouchDB cluster. I want to do this for HA and performance; the users would talk to the read-only cluster.
This setup doesn't seem to work; no matter what I try, the replication always fails with an authorization error. I'm 100% sure the password is correct. Indeed, I can't even seem to set up replication between one database and another within the cluster. All four nodes are running 2.3.0.
Is replication not compatible with clustering?
This does work; I found out what the issue was. I was specifying the clear-text admin password in the cluster configuration, and each node was generating a hash for it with a different salt. The solution was to specify the same hashed value in the config file for all clustered nodes.
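For anyone hitting the same thing, the fix looks roughly like this (the PBKDF2 string below is a placeholder, not a real hash): generate the hash once on one node, then copy the identical line into the [admins] section of local.ini on every node in the cluster.

; Hedged sketch of local.ini on each clustered node - the hash value is
; a placeholder; the key point is that every node gets the same pre-hashed
; string instead of hashing the clear-text password with its own salt.
[admins]
admin = -pbkdf2-<derived_key>,<salt>,10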

Creating Cassandra sub-clusters

I need to create K overlapping Cassandra clusters on N machines (K>>N). Each cluster can have between 1 to N nodes. I know that one way of doing so is to create a separate process (or docker container) for each cluster a node is a member of.
My question, however, is whether I can change Cassandra to allow the creation of sub-clusters, meaning that there would be only 1 Cassandra instance running on each node, but I would be able to take control of data replication and data placement so that, for example, I could do a QUORUM write within a sub-cluster.
No, it's not possible to define the sub-cluster as you describe - there is always a single Cassandra cluster per process.
But Cassandra has the notion of a datacenter, which defines where a machine resides, and the keyspace, which defines how the data is replicated between datacenters and nodes. A consistency level like QUORUM depends on the keyspace configuration.
In your case I would think in that direction - define datacenters, create the necessary keyspaces, and set up the correct replication factors for those keyspaces.
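A rough sketch of that direction (the DC names and replication factors below are made up for illustration): if nodes are assigned to datacenters dc_a and dc_b via the snitch configuration, each "sub-cluster" becomes a keyspace that is replicated only to the DCs it should live on, and a LOCAL_QUORUM read/write against that keyspace then only involves those replicas.

-- Hedged sketch: one keyspace per logical "sub-cluster", placed on specific DCs.
CREATE KEYSPACE subcluster_a
  WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'dc_a' : 3 };

CREATE KEYSPACE subcluster_ab
  WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'dc_a' : 3, 'dc_b' : 2 };
-- A LOCAL_QUORUM operation on subcluster_a issued from a dc_a client only
-- involves that keyspace's replicas in dc_a.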

Adding new keyspace in existing production cassandra cluster

I" have an existing cassandra cluster running in AWS. It has total 6 nodes in the same data center but in multiple regions. We are using cassandra version 2.2.8 in production. There are two existing keyspaces already present in the production environment. I want to add a new keyspace to the production cluster.
I am new to Cassandra so looking for following answers:
Can I add a new keyspace to the existing production cluster without taking the cluster down?
Any best practices you would recommend for adding the new keyspace to the existing cluster?
Possible steps to add the new keyspace?
I really appreciate your help!
Yes, you can add keyspaces online.
When you add a keyspace, you have to choose the replication factor. As you are multi-region on AWS, you are probably using Ec2MultiRegionSnitch as your endpoint_snitch, right?
If so, you probably configured dc_suffix=_XYZ, and now your DCs look like this: "us-east_XYZ" (see nodetool status).
Then, you can use something like this:
CREATE KEYSPACE my_keyspace
  WITH REPLICATION = {
    'class' : 'NetworkTopologyStrategy',
    'us-east_XYZ' : 2,
    'us-west_XYZ' : 2 }
  AND DURABLE_WRITES = true;
See docs: CREATE KEYSPACE
