Restart read required when migrating an existing application from PostgreSQL to YugabyteDB

[Question posted by a user on YugabyteDB Community Slack]
I am trying to migrate an existing application from PostgreSQL to YugabyteDB using a cluster with 3 nodes.
The smoke tests run fine, but I receive the following error as soon as I use more than one concurrent user:
com.yugabyte.util.PSQLException: ERROR: Query error: Restart read required at: { read: { physical: 1648067607419747 } local_limit: { physical: 1648067607419747 } global_limit: <min> in_txn_limit: <max> serial_no: 0 }
at com.yugabyte.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2675)
at com.yugabyte.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2365)
at com.yugabyte.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:355)
at com.yugabyte.jdbc.PgStatement.executeInternal(PgStatement.java:490)
at com.yugabyte.jdbc.PgStatement.execute(PgStatement.java:408)
at com.yugabyte.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:162)
at com.yugabyte.jdbc.PgPreparedStatement.execute(PgPreparedStatement.java:151)
at com.zaxxer.hikari.pool.ProxyPreparedStatement.execute(ProxyPreparedStatement.java:44)
at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.execute(HikariProxyPreparedStatement.java)
at org.jooq.tools.jdbc.DefaultPreparedStatement.execute(DefaultPreparedStatement.java:219)
at org.jooq.impl.Tools.executeStatementAndGetFirstResultSet(Tools.java:4354)
at org.jooq.impl.AbstractResultQuery.execute(AbstractResultQuery.java:230)
at org.jooq.impl.AbstractQuery.execute(AbstractQuery.java:340)
at org.jooq.impl.AbstractResultQuery.fetch(AbstractResultQuery.java:284)
at org.jooq.impl.SelectImpl.fetch(SelectImpl.java:2843)
at org.jooq.impl.DefaultDSLContext.fetch(DefaultDSLContext.java:4749)
I am using version 11.2-YB-2.13.0.1-b0.
It is a clinical data repository implemented using Spring Boot and JOOQ. The application exposes a REST API to store and query clinical documents inside the database.
I am executing a JMeter test plan that creates and queries random documents with 10 concurrent users over a fixed period (5 minutes).
Until now we were using PostgreSQL, which has Read Committed as its default isolation level. So I assume I have to change the isolation level at the application level, as Spring uses the one defined by the database by default.

Note that PostgreSQL and YugabyteDB do not have the same default isolation level.
Read Committed isolation is supported only if the gflag yb_enable_read_committed_isolation is set to true. By default this gflag is false, in which case the Read Committed isolation level of YugabyteDB's transactional layer falls back to the stricter Snapshot Isolation (and READ COMMITTED and READ UNCOMMITTED in YSQL in turn map to Snapshot Isolation).
Can you change the isolation level as above for YugabyteDB? Please refer to this doc for more details: https://docs.yugabyte.com/latest/explore/transactions/isolation-levels/
It should work much better after the change; a sketch of enabling the flag follows.
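A minimal sketch of enabling the flag, assuming a yugabyted-managed cluster (the exact start command depends on how your nodes are launched, and the flag must be set on every node):

./bin/yugabyted start --tserver_flags="yb_enable_read_committed_isolation=true"

With the flag enabled, Read Committed becomes a true isolation level in YSQL, and the query layer can internally retry read-restart errors like the one above instead of surfacing them to the application.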

Related

Errors persisting after recovering YugabyteDB cluster

[Question posted by a user on YugabyteDB Community Slack]
We’re trying to do a postmortem on an issue we hit in our cluster. It looks like one of our 3 nodes went down and the other two were unable to process requests until it came back. Looking over the logs, I see this message a lot from both before and during the outage:
W0810 00:46:40.740047 3997211 leader_election.cc:285] T 00000000000000000000000000000000 P f65e3577ff4e42a3b935c36a99be1fb9 [CANDIDATE]: Term 7 pre-election: Tablet error from VoteRequest() call to peer df99aaa63d14414785aa9842fcf2fdc1: Invalid argument (yb/tserver/service_util.h:75): RequestConsensusVote: Wrong destination UUID requested. Local UUID: 55065b84a4df41ffac5841463871778a. Requested UUID: df99aaa63d14414785aa9842fcf2fdc1
I0810 00:46:40.740072 3997211 leader_election.cc:244] T 00000000000000000000000000000000 P f65e3577ff4e42a3b935c36a99be1fb9 [CANDIDATE]: Term 7 pre-election: Election decided. Result: candidate lost.
Unfortunately, we lost the logs from the node that went down due to a data loss issue on our side. Also, I’m actually still seeing the messages above even though the cluster has recovered, so it looks like we’re still in a bad state.
What does this mean and does it prevent the cluster from electing a new leader?
The yb-master process currently running on prod-db-us-2 has a UUID of 55065b84a4df41ffac5841463871778a, but the yb-master process running on prod-db-us-1 believes that the yb-master on prod-db-us-2 has a UUID of df99aaa63d14414785aa9842fcf2fdc1. This seems like a configuration issue.
My guess is that 55065b84a4df41ffac5841463871778a was originally df99aaa63d14414785aa9842fcf2fdc1. The UUID could change if the data directory is wiped.
You had a loss-of-data incident on prod-db-us-2 about a month and a half ago, so that’s probably when the UUID changed.
Here’s the official documentation for replacing a failed master: https://docs.yugabyte.com/preview/troubleshoot/cluster/replace_master/
Alternatively, you could wipe 55065b84a4df41ffac5841463871778a and create a new yb-master using the gflag instance_uuid_override to force it to initialize with UUID df99aaa63d14414785aa9842fcf2fdc1, along the lines of the sketch below.
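A rough sketch of that second option; <fs-data-dir> is a placeholder, 7100 is the default master RPC port, and the process would also need whatever other flags your masters normally run with:

# On prod-db-us-2: stop the broken master, wipe its data, and restart with the old UUID forced.
rm -rf <fs-data-dir>/yb-data/master
yb-master \
  --fs_data_dirs=<fs-data-dir> \
  --master_addresses=prod-db-us-1:7100,prod-db-us-2:7100,prod-db-us-3:7100 \
  --instance_uuid_override=df99aaa63d14414785aa9842fcf2fdc1

The replacement master then rejoins the Raft group under the UUID its peers expect and catches up from the other two masters.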

ERROR: SET TRANSACTION ISOLATION LEVEL must not be called in a subtransaction in YugabyteDB

[Question posted by a user on YugabyteDB Community Slack]
I am facing the following issue when I try to dump a database in YugabyteDB 2.13.0.1:
[yuga#yugadb-tserver1 ~]$ ./yugabyte-2.13.0.1/postgres/bin/ysql_dump -d ehrbase > ./backups/ehrbase_100.sql
ysql_dump: [archiver (db)] query failed: ERROR: SET TRANSACTION ISOLATION LEVEL must not be called in a subtransaction
ysql_dump: [archiver (db)] query was: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, READ ONLY, DEFERRABLE
This is the same issue as https://github.com/yugabyte/yugabyte-db/issues/11630. The fix is already done and will ship in the next 2.13.* release; it just has not been released yet.

Can’t read JSON from CDC in YugabyteDB

[Question posted by a user on YugabyteDB Community Slack]
I am trying to read data in JSON format from CDC (YugabyteDB 2.13) for which I've used the following configuration:
connector.class=io.debezium.connector.yugabytedb.YugabyteDBConnector
database.streamid=88433e52543c4ecdb20934c6135beb3f
database.user=yugabyte
database.dbname=yugabyte
tasks.max=7
database.server.name=dbserver1
database.port=5433
database.master.addresses=<ip>:7100
database.hostname=<hostname>
database.password=yugabyte
table.include.list= sch.test
snapshot.mode=never
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
But I am unable to display data in JSON format. In fact, the connector fails with:
Schema for table 'sch.test' is missing (io.debezium.connector.yugabytedb.YugabyteDBChangeRecordEmitter:290)
[Worker-070086c4d98efddbd] [2022-04-12 11:04:15,978] ERROR [qorbital-test-json-msk|task-0] Producer failure (io.debezium.pipeline.ErrorHandler:31)
[Worker-070086c4d98efddbd] org.apache.kafka.connect.errors.ConnectException: Error while processing event at offset {transaction_id=null,
Is there a way I can fix it?
The issue was with the version of YugabyteDB you were using (2.13.0). That release had a bug related to connector restarts that caused a NullPointerException: the schema cached in the Debezium connector object was not persisted across restarts. To fix the bug, we added logic to send the schema from the server side every time there is a restart, or whenever the connector requests it; this keeps the schema cached and resolves the NPE. Upgrading to a newer version will fix the issue.

Checking for availability of Atlas with Mongoose?

We are creating a Node.js-based solution that uses MongoDB by means of Mongoose. We recently started adding support for Atlas, but we would like to be able to fall back to non-Atlas queries when Atlas is not available for a given connection.
I can't assume the software will be using MongoDB Cloud. Although I could make assumptions based on the URL, I'd still need a way to do something like:
const available: boolean = MyModel.connection.isAtlasAvailable()
The reason we want this is that if we assume Atlas is available and the client then uses a locally hosted MongoDB, the following code will break, since $search is Atlas-specific:
const results = await Person.aggregate([
  {
    $search: {
      index: 'people_text_index',
      deleted: { $ne: true },
      text: {
        query: filter.query,
        path: {
          wildcard: '*'
        }
      },
      count: {
        type: 'total'
      }
    }
  },
  {
    $addFields: {
      mongoMeta: '$$SEARCH_META'
    }
  },
  { $skip: offset },
  { $limit: limit }
]);
I suppose I could surround this with a try/catch and then fall back to a non-Atlas search, but I'd rather check something is doable before trying an operation.
Is there any way to check whether MongoDB Atlas is available, for a given connection? As an extension to the question, does Mongoose provide a general pattern for checking for feature availability, such as if the connection supports transactions?
I suppose I could surround this with a try/catch and then fall back to a non-Atlas search, but I'd rather check something is doable before trying an operation.
For an isAtlasCluster() check, the most straightforward approach would be a regex match confirming that the hostnames in the connection URI end in mongodb.net, as used by MongoDB Atlas clusters.
However, it would also be much more efficient to set a feature flag based on the connection URI when your application is initialised, rather than using try/catch within the model on every request (which adds the latency of at least one failed round trip to every search request). A sketch follows.
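A minimal sketch of that startup check; the helper name isAtlasCluster() comes from the question, and the environment variable and flag name are illustrative:

// Returns true when every host in the connection URI is an Atlas host.
function isAtlasCluster(uri) {
  // Strip the scheme and credentials; keep the host list before the first "/".
  const authority = uri.replace(/^mongodb(\+srv)?:\/\//, '').split('/')[0];
  const hosts = authority.split('@').pop().split(',');
  return hosts.every((host) => /\.mongodb\.net(:\d+)?$/i.test(host));
}

// Evaluate once at application startup and branch on the flag in your models.
const atlasSearchAvailable = isAtlasCluster(process.env.MONGODB_URI);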
I would also note that checking for an Atlas connection is not equivalent to checking if Atlas Search is configured for a collection. If your application requires some initial configuration of search indexes, you may want to have a more explicit feature flag configured by an app administrator or enabled as part of search index creation.
There are a few more considerations depending on the destination cluster tier:
Atlas free & shared tier clusters support fewer indexes, so a complex application may have a minimum cluster tier requirement.
Atlas Serverless Instances (currently in preview) do not currently support Atlas Search (see Serverless Instance Limitations).
As an extension to the question, does Mongoose provide a general pattern for checking for feature availability, such as if the connection supports transactions?
Multi-document transactions are supported in all non-EOL versions of MongoDB server (4.2+) as long as you are connected to a replica set or sharded cluster deployment using the WiredTiger storage engine (default for new deployments since MongoDB 3.2). MongoDB 4.0 also supports multi-document transactions, but only for replica set deployments using WiredTiger.
If your application has a requirement for multi-document transaction support, I would also check that on startup or make it part of your application deployment prerequisites; a sketch of such a probe follows.
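A hedged sketch of a startup probe, using the server's hello command (older servers use isMaster) to detect the deployment type; note it does not verify the storage engine:

// Rough check: transactions require a replica set (setName present) or a
// sharded cluster (msg === 'isdbgrid'); a standalone server reports neither.
async function supportsTransactions(connection) {
  const hello = await connection.db.admin().command({ hello: 1 });
  return Boolean(hello.setName) || hello.msg === 'isdbgrid';
}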
Overall this feels like complexity that should be covered by prerequisites and set up of your application rather than runtime checks which may cause your application to behave unexpectedly even if the initial deployment seems fine.

CouchDB v1.7.1 database replication to CouchDB v2.3.0 database fails

In Fauxton, I've setup a replication rule from a CouchDB v1.7.1 database to a new CouchDB v2.3.0 database.
The source does not have any authentication configured. The target does. I've added the username and password to the Job Configuration.
It looks like the replication got stuck somewhere in the process. 283.8 KB (433 documents) are present in the new database. The source contains about 18.7 MB (7215 docs) of data.
When restarting the database, I'm always getting the following error:
[error] 2019-02-17T17:29:45.959000Z nonode#nohost <0.602.0> --------
throw:{unauthorized,<<"unauthorized to access or create database
http://my-website.com/target-database-name/">>}:
Replication 5b4ee9ddc57bcad01e549ce43f5e31bc+continuous failed to
start "https://my-website.com/source-database-name/ "
-> "http://my-website.com/target-database-name/ " doc
<<"shards/00000000-1fffffff/_replicator.1550593615">>:<<"1e498a86ba8e3349692cc1c51a00037a">>
stack:[{couch_replicator_api_wrap,db_open,4,[{file,"src/couch_replicator_api_wrap.erl"},{line,114}]},{couch_replicator_scheduler_job,init_state,1,[{file,"src/couch_replicator_scheduler_job.erl"},{line,584}]}]
I'm not sure what is going on here. From the logs I understand there's an authorization issue, but the database is already present (hence it has already been partially replicated).
What does this error mean and how can it be resolved?
The reason for this error is that the CouchDB v2.3.0 instance was being re-initialized on reboot and required me to fill in the cluster configuration again.
Therefore, the replication could not continue until I had re-applied the configuration.
The issue of having to re-apply the cluster configuration has been addressed in another Stack Overflow question.
