[Question posted by a user on YugabyteDB Community Slack]
Is it possible to have the primary data cluster and read replica be at different versions?
For example, the primary cluster on 2.1 and the read replica cluster on 2.8.
Ideally, no. During software upgrades, different nodes can temporarily be on different releases. But in a steady state, the read-replica nodes should run the same version as the primary cluster.
With xCluster, where the two clusters are really independent clusters linked with an async replication channel, they can be on different releases for extended periods of time.
[Question posted by a user on YugabyteDB Community Slack]
I’m trying to understand failure tolerance in YugabyteDB.
My scenario is as follows:
The universe is set up with a primary data cluster and 1 read replica cluster, with max_stale_read_bound_time_ms = 60.
And the primary data cluster got wiped out (lost all data).
Questions:
Would we be able to rebuild the primary data cluster with the read replica cluster?
Can the read replica cluster become the primary data cluster?
The answer is no to both.
xCluster is what you want to use for DR.
The design point for read replicas in YugabyteDB is not DR, but rather bringing data closer to where it is being read from. Read replicas also have no yb-master, and without a yb-master a cluster cannot function on its own.
[Question posted by a user on YugabyteDB Community Slack]
In terms of the number of yb-masters, there should be as many as the replication factor. My question is: is having masters and tservers running on all the nodes a bad policy?
And, if we have a multi-DC deployment, should we have at least 1 yb-master in each DC?
I guess the best is to place the yb-master leader in the DC that is going to host the main workload (if there is any), right?
It's perfectly normal to colocate a yb-tserver and a yb-master on the same server. But in large deployments, it's better for them to be on separate servers to split the workloads (so heavy usage of the yb-tserver won't interfere with the yb-master).
And, if you have a multi-DC deployment, then you should deploy one yb-master in each region, so that you have regional failover for the yb-masters too.
For YugabyteDB to remain usable, you must have 2 out of 3 yb-masters available, so with a 2-DC setup you cannot build a deployment that is always available: you have to place 2 masters in one DC and 1 in the other, and losing the DC with 2 masters loses the quorum. So the only solution for high availability is 3 DCs.
Do 3 DCs with the same number of nodes in each DC, so you will end up with a total of 3, 6, 9, etc. nodes. A yb-master should be in each DC; if not, you will again lose resilience.
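To complement having a yb-master in each DC, you also tell the cluster to spread data replicas across the three DCs. A minimal sketch with yb-admin modify_placement_info, assuming hypothetical master addresses m1/m2/m3 and placement names cloud1.dc1.zone1, cloud1.dc2.zone1, cloud1.dc3.zone1 with replication factor 3 (adapt these to your actual cloud/region/zone values):
```
# Spread RF=3 data replicas across the three DCs (placement names are placeholders)
yb-admin --master_addresses m1:7100,m2:7100,m3:7100 \
  modify_placement_info cloud1.dc1.zone1,cloud1.dc2.zone1,cloud1.dc3.zone1 3
```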
I guess the best is to place the yb-master leader in the DC that is going to host the main workload (if there is any), right?
In this case, you can set one region/zone as the preferred one, and the database will try to place tablet leaders there automatically using set-preferred-zones: https://docs.yugabyte.com/latest/admin/yb-admin/#set-preferred-zones
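A minimal sketch of that, assuming hypothetical master addresses m1/m2/m3 and a placement named cloud1.dc1.zone1 (check the linked yb-admin docs for the exact syntax in your version):
```
# Ask the cluster to place tablet leaders in dc1 (placement name is a placeholder)
yb-admin --master_addresses m1:7100,m2:7100,m3:7100 \
  set_preferred_zones cloud1.dc1.zone1
```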
[Question posted by a user on YugabyteDB Community Slack]
If I need to restore snapshots from one cluster (4 nodes) to another cluster (3 nodes), how do we do that as per the documentation?
The data of the 1st node should be restored on the 1st node of the other cluster, and similarly for 2 more nodes.
What do we do with the data of the remaining node of the 1st cluster?
Data is snapshotted and restored per tablet. In your case, the new cluster will distribute the tablets across fewer nodes, and each node will have more tablets. You just have to go over all the nodes and restore each tablet according to how the tablets were distributed when you imported the snapshot file; the number of nodes doesn't matter.
More details at: https://docs.yugabyte.com/latest/manage/backup-restore/snapshot-ysql/
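A rough sketch of that flow (the database name, snapshot IDs, and host names below are placeholders, and the exact steps may differ between versions, so follow the linked docs):
```
# On the source cluster: snapshot the database and export the snapshot metadata
yb-admin --master_addresses src-m1:7100 create_database_snapshot ysql.yugabyte
yb-admin --master_addresses src-m1:7100 list_snapshots
yb-admin --master_addresses src-m1:7100 export_snapshot <snapshot-id> my_db.snapshot

# On the target cluster (3 nodes instead of 4): import the metadata, copy each
# tablet's snapshot files to whichever target node now hosts that tablet
# (the import output shows the old-to-new tablet mapping), then restore
yb-admin --master_addresses dst-m1:7100 import_snapshot my_db.snapshot
yb-admin --master_addresses dst-m1:7100 restore_snapshot <new-snapshot-id>
```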
[Question posted by a user on YugabyteDB Community Slack]
Is there a way to set wal_level = logical instead of replica in YugabyteDB? I’m looking to implement CDC so I can synchronize other systems with the changes.
Note that YugabyteDB doesn’t reuse the PostgreSQL storage engine & replication mechanism, but has built its own. And it doesn’t support setting wal_level=logical.
Currently, CDC support is in the works: https://github.com/yugabyte/yugabyte-db/issues/9019.
Can I get some help in understanding the real difference between YSQL and YCQL? Based on the documentation, I understand that the current underlying storage implementation for YugabyteDB is DocDB, which uses Raft for replication.
Based on this, can I assume that the only difference between YSQL and YCQL is that we have triggers, stored procedures, and SQL features in YSQL and not in YCQL?
Great question. The plan is that over time YSQL will have most of the features in YCQL, but that is not the case today. This is because there is significant work left to be done in YSQL to achieve parity, some of which is already in progress.
YSQL features
YSQL re-uses the upper half of PostgreSQL with a horizontally scalable lower half called DocDB. Thus, YSQL is meant to support all PostgreSQL features, including stored procedures, triggers, common table expressions, extensions, and foreign data wrappers (the last of which is not done yet).
YCQL features not in YSQL
Here is a list of YCQL features not in YSQL.
Cluster awareness: The client drivers are cluster aware, meaning the clients can discover all nodes of the cluster given just one contact point. These client drivers also get notified of node additions/removals, and therefore apps do not need a load balancer to use a distributed cluster. There is ongoing work to incorporate this functionality into YSQL as part of the jdbc-yugabytedb project.
Topology awareness: The client drivers are also topology aware, meaning they are notified of the regions/zones in which the various nodes of the cluster are deployed. They can perform operations such as reading from the nearest region/datacenter.
Automatic data expiry: YCQL supports automatic expiry of data using the TTL feature - you can set a retention policy for data at the table or row level, and the older data is automatically purged from the DB (see the sketch after this list).
Collection data types: YCQL supports collection data types such as sets, maps, and lists (also shown in the sketch below). Note that both YCQL and YSQL support JSONB, which can be used to model these, though.
Cassandra API compatibility: YCQL is Cassandra API compatible, and therefore supports the Cassandra ecosystem projects. Examples include the Spark and Kafka connectors, JanusGraph and KairosDB support, etc. Note that while these ecosystem integrations could be built on top of YSQL, that does not exist today and is a matter of prioritization.
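To illustrate the TTL and collection points above, here is a small YCQL sketch (the keyspace, table, and column names are made up; statements are run in ycqlsh):
```
-- Table-level TTL: rows expire after one day (86400 seconds)
CREATE KEYSPACE IF NOT EXISTS demo;
CREATE TABLE demo.user_events (
  user_id TEXT PRIMARY KEY,
  tags SET<TEXT>,             -- collection type: set
  attributes MAP<TEXT, TEXT>  -- collection type: map
) WITH default_time_to_live = 86400;

-- Row-level TTL: this row is automatically purged after 60 seconds
INSERT INTO demo.user_events (user_id, tags, attributes)
VALUES ('u1', {'beta'}, {'plan' : 'free'}) USING TTL 60;
```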