Can’t read JSON from CDC in YugabyteDB

[Question posted by a user on YugabyteDB Community Slack]
I am trying to read data in JSON format from CDC (YugabyteDB 2.13) for which I've used the following configuration:
connector.class=io.debezium.connector.yugabytedb.YugabyteDBConnector
database.streamid=88433e52543c4ecdb20934c6135beb3f
database.user=yugabyte
database.dbname=yugabyte
tasks.max=7
database.server.name=dbserver1
database.port=5433
database.master.addresses=<ip>:7100
database.hostname=<hostname>
database.password=yugabyte
table.include.list= sch.test
snapshot.mode=never
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
But I am unable to read the data in JSON format. In fact, the connector fails with the following errors:
Schema for table 'sch.test' is missing (io.debezium.connector.yugabytedb.YugabyteDBChangeRecordEmitter:290)
[Worker-070086c4d98efddbd] [2022-04-12 11:04:15,978] ERROR [qorbital-test-json-msk|task-0] Producer failure (io.debezium.pipeline.ErrorHandler:31)
[Worker-070086c4d98efddbd] org.apache.kafka.connect.errors.ConnectException: Error while processing event at offset {transaction_id=null,
Is there a way I can fix it?

The issue was with the version of YugabyteDB you were using (2.13.0). That release had a bug related to connector restarts: the schema cached in the Debezium connector object was not persisted across a restart, which caused a NullPointerException. To fix the bug, we added logic to send the schema from the server side on every restart, or whenever the connector requests it, so the connector can re-cache the schema and avoid the NPE. Upgrading to a newer version will fix the issue.

Related

Errors persisting after recovering YugabyteDB cluster

[Question posted by a user on YugabyteDB Community Slack]
We’re trying to do a postmortem on an issue we hit in our cluster. It looks like one of our 3 nodes went down and the other two were unable to process requests until it came back. Looking over the logs, I see these messages a lot, both from before and during the outage:
W0810 00:46:40.740047 3997211 leader_election.cc:285] T 00000000000000000000000000000000 P f65e3577ff4e42a3b935c36a99be1fb9 [CANDIDATE]: Term 7 pre-election: Tablet error from VoteRequest() call to peer df99aaa63d14414785aa9842fcf2fdc1: Invalid argument (yb/tserver/service_util.h:75): RequestConsensusVote: Wrong destination UUID requested. Local UUID: 55065b84a4df41ffac5841463871778a. Requested UUID: df99aaa63d14414785aa9842fcf2fdc1
I0810 00:46:40.740072 3997211 leader_election.cc:244] T 00000000000000000000000000000000 P f65e3577ff4e42a3b935c36a99be1fb9 [CANDIDATE]: Term 7 pre-election: Election decided. Result: candidate lost.
Unfortunately, we lost the logs from the node that went down due to a data-loss issue on our side. Also, I’m still seeing the messages above even though the cluster has recovered, so it looks like we’re still in that state.
What does this mean and does it prevent the cluster from electing a new leader?
The yb-master process recently running on prod-db-us-2 has a UUID of 55065b84a4df41ffac5841463871778a but the yb-master process running on prod-db-us-1 believes that the yb-master on prod-db-us-2 has a UUID of df99aaa63d14414785aa9842fcf2fdc1. This seems like a configuration issue.
My guess is that 55065b84a4df41ffac5841463871778a was originally df99aaa63d14414785aa9842fcf2fdc1. The UUID could change if the data directory is wiped.
You had a loss of data incident on prod-db-us-2 about a month and a half ago so that’s probably when the UUID changed.
Here’s the official documentation for replacing a failed master: https://docs.yugabyte.com/preview/troubleshoot/cluster/replace_master/
Alternatively, you could wipe 55065b84a4df41ffac5841463871778a and create a new yb-master using the gflag instance_uuid_override to force it to initialize with UUID df99aaa63d14414785aa9842fcf2fdc1.
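As a rough illustration of that alternative (the paths and addresses below are placeholders, not taken from this cluster; only the instance_uuid_override flag comes from the answer above):
# On prod-db-us-2: stop the current yb-master, wipe its data directory, then start a
# fresh yb-master forced to initialize with the UUID the other masters expect.
rm -rf /mnt/d0/yb-master-data/*
./bin/yb-master \
  --master_addresses <prod-db-us-1>:7100,<prod-db-us-2>:7100,<prod-db-us-3>:7100 \
  --fs_data_dirs=/mnt/d0/yb-master-data \
  --instance_uuid_override=df99aaa63d14414785aa9842fcf2fdc1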

ERROR: SET TRANSACTION ISOLATION LEVEL must not be called in a subtransaction in YugabyteDB

[Question posted by a user on YugabyteDB Community Slack]
I am facing the following issue when I try to dump a database in YugabyteDB 2.13.0.1:
[yuga@yugadb-tserver1 ~]$ ./yugabyte-2.13.0.1/postgres/bin/ysql_dump -d ehrbase > ./backups/ehrbase_100.sql
ysql_dump: [archiver (db)] query failed: ERROR: SET TRANSACTION ISOLATION LEVEL must not be called in a subtransaction
ysql_dump: [archiver (db)] query was: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, READ ONLY, DEFERRABLE
This is the same issue as https://github.com/yugabyte/yugabyte-db/issues/11630. A fix is already done and will ship in the next 2.13.* release; it just hasn't been released yet.

Unable to import snapshot meta file in YugabyteDB, table not found

[Question posted by a user on YugabyteDB Community Slack]
I’m getting the following error while importing the snapshot:
Error running import_snapshot: Invalid argument (yb/master/catalog_manager_ent.cc:1315): Unable to import snapshot meta file FOOBAR.snapshot: YSQL table not found: notes: OBJECT_NOT_FOUND (master error 3)
I am following this document - https://docs.yugabyte.com/preview/manage/backup-restore/snapshot-ysql/#restore-a-snapshot
After carefully reading the error, I realized the import is failing on the table notes.
The schema import was failing because the tservers were not distributed properly across the available AZs; once that was fixed, the import worked.
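For anyone hitting the same error, a hedged sketch of how you might double-check the tserver layout before retrying the import (the master addresses are placeholders; the placement itself is easiest to confirm on the yb-master web UI's Tablet Servers page):
# List the registered tservers and confirm they are spread across the expected AZs
# (the cloud/region/zone placement is visible on the master web UI at :7000/tablet-servers),
# then retry the import.
./bin/yb-admin --master_addresses <m1>:7100,<m2>:7100,<m3>:7100 list_all_tablet_servers
./bin/yb-admin --master_addresses <m1>:7100,<m2>:7100,<m3>:7100 import_snapshot FOOBAR.snapshot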

Restart read required when migrating existing application from PostgreSQL to YugabyteDB

[Question posted by a user on YugabyteDB Community Slack]
I am trying to migrate an existing application from PostgreSQL to YugabyteDB using a cluster with 3 nodes.
The smoke tests run fine, but I receive the following error as soon as I use more than one concurrent user:
com.yugabyte.util.PSQLException: ERROR: Query error: Restart read required at: { read: { physical: 1648067607419747 } local_limit: { physical: 1648067607419747 } global_limit: <min> in_txn_limit: <max> serial_no: 0 }
at com.yugabyte.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2675)
at com.yugabyte.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2365)
at com.yugabyte.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:355)
at com.yugabyte.jdbc.PgStatement.executeInternal(PgStatement.java:490)
at com.yugabyte.jdbc.PgStatement.execute(PgStatement.java:408)
at com.yugabyte.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:162)
at com.yugabyte.jdbc.PgPreparedStatement.execute(PgPreparedStatement.java:151)
at com.zaxxer.hikari.pool.ProxyPreparedStatement.execute(ProxyPreparedStatement.java:44)
at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.execute(HikariProxyPreparedStatement.java)
at org.jooq.tools.jdbc.DefaultPreparedStatement.execute(DefaultPreparedStatement.java:219)
at org.jooq.impl.Tools.executeStatementAndGetFirstResultSet(Tools.java:4354)
at org.jooq.impl.AbstractResultQuery.execute(AbstractResultQuery.java:230)
at org.jooq.impl.AbstractQuery.execute(AbstractQuery.java:340)
at org.jooq.impl.AbstractResultQuery.fetch(AbstractResultQuery.java:284)
at org.jooq.impl.SelectImpl.fetch(SelectImpl.java:2843)
at org.jooq.impl.DefaultDSLContext.fetch(DefaultDSLContext.java:4749)
I am using version 11.2-YB-2.13.0.1-b0.
It is a clinical data repository implemented using Spring Boot and jOOQ. The application exposes a REST API to store and query clinical documents inside the database.
I am executing a JMeter test plan that creates and queries random documents with 10 concurrent users over a fixed period (5 min).
Until now, we were using PostgreSQL, which has Read Committed as its default isolation level. So I assume that I have to change the isolation level at the application level, since Spring uses the one defined by the database by default.
Please note that the default isolation level of PostgreSQL and YugabyteDB is not the same.
Read Committed isolation is supported only if the gflag yb_enable_read_committed_isolation is set to true. By default this gflag is false, and in that case the Read Committed isolation level of YugabyteDB's transactional layer falls back to the stricter Snapshot Isolation (so READ COMMITTED and READ UNCOMMITTED in YSQL also use Snapshot Isolation).
Can you change the isolation level as above for YugabyteDB? Please refer to this doc link for more details - https://docs.yugabyte.com/latest/explore/transactions/isolation-levels/
It should work much better after the change.
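For illustration, a minimal sketch of enabling the flag when starting each yb-tserver (the master address and data directory are placeholders; only yb_enable_read_committed_isolation is taken from the explanation above):
# Pass the gflag to every yb-tserver so YSQL's READ COMMITTED maps to true Read Committed
# semantics, matching what the application was getting from PostgreSQL by default.
./bin/yb-tserver \
  --tserver_master_addrs <master-ip>:7100 \
  --fs_data_dirs=/mnt/d0 \
  --yb_enable_read_committed_isolation=true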

How to disable 'spark.security.credentials.${service}.enabled' in Structured Streaming while connecting to a Kafka cluster

I am trying to read data from a secured Kafka cluster using Spark Structured Streaming.
I am using the library "spark-sql-kafka-0-10_2.12":"3.0.0-preview" to read the data, since it supports specifying a custom consumer group id (instead of Spark generating its own group id).
Dependency used in code:
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql-kafka-0-10_2.12</artifactId>
<version>3.0.0-preview</version>
I am getting the below error, even after specifying the required JAAS configuration in the Spark options.
Caused by: java.lang.IllegalArgumentException: requirement failed: Delegation token must exist for this connector.
at scala.Predef$.require(Predef.scala:281)
at org.apache.spark.kafka010.KafkaTokenUtil$.isConnectorUsingCurrentToken(KafkaTokenUtil.scala:299)
at org.apache.spark.sql.kafka010.KafkaDataConsumer.getOrRetrieveConsumer(KafkaDataConsumer.scala:533)
at org.apache.spark.sql.kafka010.KafkaDataConsumer.$anonfun$get$1(KafkaDataConsumer.scala:275)
The following document states that we can disable the feature of obtaining a delegation token: https://spark.apache.org/docs/3.0.0-preview/structured-streaming-kafka-integration.html
I tried setting the property spark.security.credentials.kafka.enabled to false in the Spark config, but it still fails with the same error.
This turned out to be a bug in the preview release that has been fixed in the GA Spark 3.x release.
Reference: https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-30495
After upgrading, we can specify our own consumer group name while fetching the data from Kafka (even though it's not recommended, and Spark prints a warning message when it is specified).
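For reference, a rough sketch of the setup after moving to a GA build (the package coordinates and application jar name are illustrative; spark.security.credentials.kafka.enabled and the kafka.group.id reader option are the ones described in the Spark Kafka integration guide linked above):
# Submit against GA Spark 3.x and explicitly disable Kafka delegation tokens.
spark-submit \
  --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.0 \
  --conf spark.security.credentials.kafka.enabled=false \
  my-streaming-app.jar
# Inside the job, the custom consumer group is passed as the reader option "kafka.group.id"
# (Spark logs a warning when it is set); the JAAS settings go in "kafka.sasl.jaas.config".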
