Getting "Already present: Duplicate request" errors in YugabyteDB YSQL - yugabytedb

[Question posted by a user on YugabyteDB Community Slack]
When sending a lot of concurrent requests to the YSQL layer, we're getting Already present: Duplicate request, XX000 errors. Are these safe to retry?

Here's an in-progress list of the retryable cases:
If you get Duplicate Request in the middle of a transaction, it can be safely retried.
If Duplicate Request was generated by a standalone statement, it is possible that the original statement was executed and applied to the database, so whether a retry is safe depends on the query and the application.
Catalog Version Mismatch — this is really a txn conflict but specifically with a DDL. There’s already an issue to change the error code for this: https://github.com/yugabyte/yugabyte-db/issues/8597.
Any error with Try Again prefix — but these might already be re-mapped/retried internally.
Leader not ready to serve requests. and Leader does not have a valid lease. are also retryable.
TimedOut requests are also retryable in some cases (e.g. for pure reads and operations in transaction blocks), but retrying is not safe for single-row writes.
Generally for non-XX000 error codes the same rules apply as for vanilla Postgres. (e.g. 40001 is retryable — and we should already map YB transaction errors and read restart required errors to that error code).
For XX000 (internal error), there are a number of specific errors that should be safe to retry.
Internally we already re-map some of the errors to YSQL/PG error codes (like mentioned above) and generally we aim to do that appropriately.
The full list of internal error codes is at: https://github.com/yugabyte/yugabyte-db/blob/master/src/yb/util/status.h#L149
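The rules above could be turned into a client-side check roughly as follows. This is only a sketch: the function name is_retryable, its parameters, and the idea of matching on message text are assumptions made for illustration (the exact message wording may differ between versions), and the list of cases is the in-progress one given above, not an exhaustive one.

```python
# Message fragments from the list above that are treated as retryable
# regardless of context. Matching on message text is an assumption of this
# sketch, needed because these errors currently surface as XX000.
ALWAYS_RETRYABLE_FRAGMENTS = (
    "catalog version mismatch",
    "try again",
    "leader not ready to serve requests",
    "leader does not have a valid lease",
)


def is_retryable(sqlstate: str, message: str,
                 in_transaction: bool, read_only: bool = False) -> bool:
    """Decide whether a failed YSQL statement may be retried (hypothetical helper)."""
    if sqlstate == "40001":
        # Transaction conflicts and read-restart errors are mapped to 40001.
        return True
    if sqlstate != "XX000":
        # Other codes follow the usual Postgres rules; this sketch does not
        # treat any of them as retryable.
        return False

    text = message.lower()
    if any(fragment in text for fragment in ALWAYS_RETRYABLE_FRAGMENTS):
        return True
    if "duplicate request" in text:
        # Safe inside a transaction block; a standalone statement may
        # already have been applied.
        return in_transaction
    if "timed out" in text or "timedout" in text:
        # Safe for pure reads and for operations in transaction blocks,
        # not for single-row writes.
        return in_transaction or read_only
    return False
```

A caller would typically wrap the whole transaction block in a loop and re-run it from the start when is_retryable returns True, rather than re-sending only the failed statement.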

Related

Dealing with Azure Cosmos DB cross-partition queries in REST API

I'm talking to Cosmos DB via the (SQL) REST API, so existing questions that refer to various SDKs are of limited use.
When I run a simple query on a partitioned container, like
select value count(1) from foo
I run into a HTTP 400 error:
The provided cross partition query can not be directly served by the gateway. This is a first chance (internal) exception that all newer clients will know how to handle gracefully. This exception is traced, but unless you see it bubble up as an exception (which only happens on older SDK clients), then you can safely ignore this message.
How can I get rid of this error? Is it a matter of running separate queries by partition key? If so, would I have to keep track of what the existing key values are?

What can cause "idle in transaction" for "BEGIN" statements

We have a node.js application that connects via pg-promise to a Postgres 11 server - all processes are running on a single cloud server in docker containers.
Sometimes we hit a situation where the application does not react anymore.
The last time this happened, I had a little time to check the db via pgAdmin, and it showed that the connections were idle in transaction with statement BEGIN and an exclusive lock of virtualxid.
I think the situation is like this:
1. the application has started a transaction by sending the BEGIN sql command to the db
2. the db got this command and started a new transaction and thus acquired an exclusive lock of mode virtualxid
3. now the db waits for the application to send the next statement/s (until it receives COMMIT or ROLLBACK) - and then it will release the exclusive lock of mode virtualxid
4. but for some reason it does not get any more statements:
I think that the node.js event-loop is blocked - because at the time when we see these locks, the node.js application does not log any more statements. But the webserver still gets requests and reported some upstream timed out requests.
Does this make sense (I'm really not sure about 2. and 3.)?
Why would all transactions block at the beginning? Is this just coincidence or is the displayed SQL maybe wrong?
BTW: In this answer I found that we can set idle_in_transaction_session_timeout so that these transactions will be released after a timeout - which is great, but I'm trying to understand what's causing this issue.
The transactions are not blocking at all. The database is waiting for the application to send the next statement.
The lock on the transaction ID is just a technique for transactions to block each other, even if they are not contending for a table lock (for example, if they are waiting for a row lock): each transaction holds an exclusive lock on its own transaction ID, and if it has to wait for a concurrent transaction to complete, it can just request a lock on that transaction's ID (and be blocked).
If all transactions look like this, then the lock must be somewhere in your application; the database is not involved.
When looking for processes blocked in the database, look for rows in pg_locks where granted is false.
Your interpretation is correct. As for why it is happening, that is hard to say. It seems like there is some kind of bug (maybe an undetected deadlock) in your application, or maybe in node.js or pg-promise. You will have to debug at that level.
As expected the problems were caused by our application code. Transactions were used incorrectly:
One of the REST endpoints started a new transaction right away, using Database.tx().
This transaction was passed down multiple levels, but one function in the chain had an error and passed undefined instead of the transaction to the next level
the lowest repository level function started a new transaction (because the transaction parameter was undefined), by using Database.tx() a second time
This started to fail, under heavy load:
The connection pool size was set to 10
When there were many simultaneous requests for this endpoint, we had a situation where 10 of the requests started (opened the outer transaction) and had not yet reached the repository code that will request the 2nd transaction.
When these requests reached the repository code, they requested a new (2nd) connection from the connection pool. But this call blocks, because all connections are currently in use.
So we have a nasty application-level deadlock.
So the solution was to fix the application code (the intermediate function must pass down the transaction correctly). Then everything works.
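The pool exhaustion described above can be reproduced without a database. The following is a minimal simulation, not the original application code: a semaphore stands in for the connection pool, POOL_SIZE and run_requests are made-up names, and the two barriers exist only to force the worst-case interleaving so that the outcome is deterministic.

```python
import threading

POOL_SIZE = 3  # small stand-in for the pool size of 10 described above


def run_requests(nested: bool, timeout: float = 0.2) -> int:
    """Run POOL_SIZE concurrent requests and return how many completed."""
    pool = threading.BoundedSemaphore(POOL_SIZE)    # the "connection pool"
    all_hold_outer = threading.Barrier(POOL_SIZE)   # simulation aid only
    all_tried_inner = threading.Barrier(POOL_SIZE)  # simulation aid only
    completed = []
    completed_lock = threading.Lock()

    def handle_request() -> None:
        pool.acquire()  # the outer transaction takes a connection
        try:
            all_hold_outer.wait()  # every connection is now in use
            ok = True
            if nested:
                # Buggy path: the transaction was not passed down, so the
                # repository asks the pool for a second connection.
                ok = pool.acquire(timeout=timeout)
                all_tried_inner.wait()  # nobody releases before all have tried
                if ok:
                    pool.release()
            if ok:
                with completed_lock:
                    completed.append(True)
        finally:
            pool.release()  # the outer transaction ends

    threads = [threading.Thread(target=handle_request) for _ in range(POOL_SIZE)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return len(completed)


stuck = run_requests(nested=True)    # every request waits on itself
fixed = run_requests(nested=False)   # transaction passed down correctly
```

In the nested case no request can finish until the acquire timeout fires, which is the same role a connection timeout plays in the real application; without one, the requests would wait forever.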
Moreover, I strongly recommend setting a sensible idle_in_transaction_session_timeout and connection-timeout. Then, even if such an application deadlock is introduced again in a future version, the application can recover automatically after this timeout.
Notes:
pg-promise before v 10.3.4 contained a small bug #682 related to the connection-timeout
pg-promise before version 10.3.5 could not recover from an idle-in-transaction-timeout and left the connection in a broken state: see pg-promise #680
Basically there was another issue: there was no need to use a transaction - because all functions were just reading data: so we can just use Database.task() instead of Database.tx()

Cassandra timeout during write query but entry present in Database

We are using Cassandra 3.0 on our system. For insertion in the db, we are using the Datastax C# driver.
We have a question regarding timeouts and retries during insertion. We faced an instance where a timeout was thrown during an insert, yet the entry is present in the database. All our settings are default, in the cassandra.yaml file as well as in the driver.
How can we know the actual status of the insert even if there is a timeout? If a timeout was thrown, how could the insert possibly have gone through? Whether the insert was successful or some default retry policy was applied, we don't have a tangible answer currently, and we need to know exactly what happened.
How do we make sure that the status of that insertion was actually successful/failed with or without the timeout?
A write timeout is not necessarily a failure to write; rather, it's a notification that not enough replicas acknowledged the write within a time period. The write will still eventually happen on all replicas.
If you do observe a write timeout, it indicates that not enough replicas responded for the configured consistency level within the configured write_request_timeout_in_ms value in cassandra.yaml, the default being 2 seconds. Keep in mind however that the write will still happen.
The coordinating Cassandra node responsible for that write sends write mutations to all replicas and responds to the client as soon as enough have replied or the timeout is reached. Because of this, if you get a WriteTimeoutException you should assume the write happened. If any of the replicas are down, the coordinator maintains a hint for that write, which will be delivered to the replica when it becomes available again.
Cassandra also employs Read Repairs and Operators should run recurring Repairs to help keep data consistent.
If your operations are idempotent, you can simply retry the write until it succeeds. Or you can attempt to read the data back to make sure the write was processed. However, depending on your application requirements, you may not need to employ these strategies and you can safely assume the write did or will happen.
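The retry-then-verify approach for idempotent writes might look like the sketch below. It does not use the DataStax driver: WriteTimeout, write_with_retry, and the in-memory table are hypothetical stand-ins, chosen only to show a write that is applied even though its first acknowledgement times out.

```python
import time
from typing import Callable, Dict


class WriteTimeout(Exception):
    """Stand-in for a driver's write-timeout error (hypothetical name)."""


def write_with_retry(write: Callable[[], None],
                     max_attempts: int = 3, delay_s: float = 0.0) -> int:
    """Call an idempotent write until it is acknowledged; return attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            write()
            return attempt
        except WriteTimeout:
            if attempt == max_attempts:
                raise  # give up and let the caller decide
            time.sleep(delay_s)
    raise ValueError("max_attempts must be at least 1")


# Demo: an upsert keyed by primary key is idempotent, so repeating it is harmless.
table: Dict[str, str] = {}
calls = {"count": 0}


def upsert_user() -> None:
    calls["count"] += 1
    table["user-1"] = "alice"   # the write is applied ...
    if calls["count"] == 1:
        raise WriteTimeout()    # ... but the first acknowledgement is lost


attempts = write_with_retry(upsert_user)
# Optional read-back to confirm the write was processed.
verified = table.get("user-1") == "alice"
```

This only works because the upsert produces the same row no matter how many times it runs; a non-idempotent operation such as a counter increment would need a different strategy.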
Please note on the other hand that unavailable errors (i.e. Not enough replicas available at consistency level X) indicate that not enough replicas were available to perform a write and therefore the write is never attempted.

knex migration error in node js app

I am using knex to connect to Postgres in my application. I get the following error when I run
knex migrate:latest
TimeoutError: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?
at Timeout._onTimeout
From reading some threads, I understand that I have to add a transacting call, but do I need to add it to all the SQL calls in my app?
The documentation does not give details about when to add it or why it is required. My queries are mostly of type "GET", so I'm not sure whether those queries need transacting applied.
It is probably a library bug.
Generally speaking, any operation, including SELECT, also needs a transaction with read locking. The DB organizes the resource locking sequence according to the transaction isolation level setting, and READ COMMITTED is the default in most cases. Rows in a table cannot be deleted while a user is reading them, until the read has finished. The delete (exclusive lock) waits until the select (shared read lock) releases it, even if no explicit begin transaction was issued.
For this reason, most database connection libraries support an "auto commit" option that automatically wraps each statement in a transaction when no explicit transaction has been started (or the DBMS supports this natively as a session option), so every request runs in a transaction block.
Knex does not seem to have this option explicitly, and the behavior may differ by DBMS dialect. While reading the code, I found that the Oracle implementation has it, but the PostgreSQL implementation does not have auto commit. It looks incomplete to me.
The documentation also says that select queries can be run without a transacting call. If that leaks many open sessions, then it's obviously a bug. Please file a bug report with sample code that reproduces the issue.
Alternatively, you could inspect which queries are pending from the database side. All modern database systems can list sessions and their locking status. I suppose you have mixed plain select calls with transacting() calls, and the plain select calls may have been appended to an uncommitted open transaction. You can watch what is happening using the DB admin features.

Handling failures in Thrift in general

I read through the official documentation and the official whitepaper, but I couldn't find a satisfying answer to how Thrift handles failures in the following scenario:
Say you have a client sending a method call to a server to insert an entry in some data structure residing in that server (it doesn't really matter what it is). Suppose the server has processed the call and inserted the entry but the client couldn't receive a response due to a network failure. In such a case, how should the client handle this? A simple retry of sending the call would possibly result in a duplicate entry being inserted. Does the Thrift library persist the response somewhere so that it can resend to the client when it is back online? Or is it the application's responsibility to do so?
Would appreciate it if someone could point out the details of how it works, besides directing to its source code.
The question is an interesting one, but it is by no means limited to Thrift. A better name would be
Handling failures in asynchronous or remote calls in general
because that's, in essence, what it is. Although in the specific case of an RPC-style API like, for example, a Thrift service, the client blocks and it seems to be a synchronous call, it really isn't that way.
The whole problem can be rephrased to the more general question about
Designing robust distributed systems
So what is the main problem, that we have to deal with? We have to assume that every call we do may fail. In particular, it can fail in three ways:
request died
request sent, server processing successful, response died
request sent, server processing failed, response died
In some cases, this is not a big deal, regardless of the exact case we have. If the client just wants to retrieve some values, he can simply re-query and will get some results eventually if he tries often enough.
In other cases, especially when the client modifies data on the server, it may become more problematic. The general recommendation in such cases is to make the service calls idempotent, meaning: regardless of how often I make the same call, the end result is always the same. This can be achieved by various means and depends more or less on the use case.
For example, one method is to send some logical "ticket" value along with each request, to filter out duplicate or outdated requests on the server. The server keeps track of and/or checks these tickets before the processing starts. But again, whether that method suits your needs depends on your use case.
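The "ticket" idea can be illustrated with a small in-memory sketch. TicketedStore and its methods are hypothetical names; a real service would also need to persist the tickets and expire old ones, which is left out here.

```python
import uuid
from typing import Dict, List


class TicketedStore:
    """Server-side store that ignores repeated requests carrying the same ticket."""

    def __init__(self) -> None:
        self.entries: List[str] = []
        self._seen: Dict[str, int] = {}  # ticket -> index of the inserted entry

    def insert(self, ticket: str, value: str) -> int:
        if ticket in self._seen:
            # Duplicate request: return the earlier result, insert nothing.
            return self._seen[ticket]
        self.entries.append(value)
        index = len(self.entries) - 1
        self._seen[ticket] = index
        return index


store = TicketedStore()
ticket = str(uuid.uuid4())                # generated once by the client
first = store.insert(ticket, "entry")     # processed, but the response is lost
second = store.insert(ticket, "entry")    # client retries with the same ticket
```

Because the client reuses the ticket on retry, the second call returns the result of the first instead of inserting a duplicate entry.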
The Command and Query Responsibility Segregation (CQRS) pattern is another approach to deal with the complexity. It basically breaks the API into setters and getters. I'd recommend looking into that topic, but it is not useful for every scenario. I'd also recommend looking at the Data Consistency Primer article. Last but not least, the CAP theorem is always a good read.
Good service/API design is not simple, and the fact that we have to deal with a distributed parallel system does not make it easier; quite the opposite.
Let me try to give a straight answer.
... is it the application's responsibility to do so?
Yes.
There are 4 types of exceptions involved in Thrift RPC: TTransportException, TProtocolException, TApplicationException, and user-defined exceptions.
Based on the book Programmer's Guide to Apache Thrift, the former 2 are local exceptions, while the latter 2 are not.
As the names imply, TTransportException includes exceptions like NOT_OPEN and TIMED_OUT, and TProtocolException includes INVALID_DATA, BAD_VERSION, etc. These exceptions are not propagated from the server to the client and act much like normal language exceptions.
TApplicationExceptions involve problems such as calling a method that isn’t implemented or failing to provide the necessary arguments to a method.
User-defined Exceptions are defined in IDL files and raised by the user code.
For all of these exceptions, no retry operations are done by Thrift RPC framework itself. Instead, they should be handled properly by the application code.

Resources