How to Avoid update deadlock - multithreading

While working on JDBC with Postgres (isolation level = "Read committed")...
I get a deadlock in a multithreaded environment when I try updating the table after some operations.
So I tried using multiple queries in one statement, as shown below:
ps = con.prepareStatement("UPDATE TableA SET column1=column1-? WHERE column2=? and column3=?;"
+ "UPDATE TableA SET column1=column1+? WHERE column2=? and column3=?;");
Here are the PostgreSQL logs for the error:
2016-12-19 12:25:44 IST STATEMENT: UPDATE TableA SET column1=column1+$1 WHERE column2=$2 and column3=$3
2016-12-19 12:25:44 IST FATAL: connection to client lost
2016-12-19 12:25:45 IST ERROR: deadlock detected
2016-12-19 12:25:45 IST DETAIL: Process 8524 waits for ShareLock on transaction 84942; blocked by process 12520.
Process 12520 waits for ShareLock on transaction 84940; blocked by process 20892.
Process 20892 waits for ExclusiveLock on tuple (1,5) of relation 25911 of database 24736; blocked by process 8524.
Process 8524: UPDATE TableA SET column1=column1-$1 WHERE column2=$2 and column3=$3
Process 12520: UPDATE TableA SET column1=column1-$1 WHERE column2=$2 and column3=$3
Process 20892: UPDATE TableA SET column1=column1-$1 WHERE column2=$2 and column3=$3
2016-12-19 12:25:45 IST HINT: See server log for query details.
2016-12-19 12:25:45 IST CONTEXT: while locking tuple (1,12) in relation "TableA"
2016-12-19 12:25:45 IST STATEMENT: UPDATE TableA SET column1=column1-$1 WHERE column2=$2 and column3=$3
2016-12-19 12:25:45 IST LOG: could not send data to client: No connection could be made because the target machine actively refused it.
In this multithreaded environment, I was expecting TableA's rows to be locked for the two statements together, avoiding the deadlock.
I see a similar scenario explained in the Postgres docs, but I could not find any method to avoid this kind of deadlock. Any help is appreciated. Thanks.
P.S.: Autocommit is already set to false, and I have also tried prepared statements with a single UPDATE query.
Regarding multiple queries: "Multiple queries in one PreparedStatement" shows that Postgres doesn't need any additional configuration for this.

As @Nick Barnes quoted in a comment, from the link I shared:
The best defense against deadlocks is generally to avoid them by being
certain that all applications using a database acquire locks on
multiple objects in a consistent order.
For update deadlocks in particular, as mentioned there, it is the order of the updates that leads to the deadlock.
Example:
UPDATE Table SET ... WHERE id= 1;
UPDATE Table SET ... WHERE id= 2;
and
UPDATE Table SET ... WHERE id= 2;
UPDATE Table SET ... WHERE id= 1;
The general solution is to order the updates based on id; this is what "a consistent order" means.
I didn't understand that until I struggled with this deadlock myself.
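Here is a minimal JDBC sketch of that idea (table, column and key names are illustrative, not from the original question): collect the keys you need to touch, sort them, and apply the updates in that sorted order in every thread.

import java.math.BigDecimal;
import java.sql.*;
import java.util.*;

public class OrderedUpdates {
    // Applies a set of deltas keyed by id, always in ascending id order, so two
    // concurrent transactions can never lock the same rows in opposite orders.
    static void applyDeltas(Connection con, Map<Integer, BigDecimal> deltaById) throws SQLException {
        List<Integer> ids = new ArrayList<>(deltaById.keySet());
        Collections.sort(ids);                             // the "consistent order"
        con.setAutoCommit(false);
        try (PreparedStatement ps = con.prepareStatement(
                "UPDATE TableA SET column1 = column1 + ? WHERE id = ?")) {
            for (Integer id : ids) {
                ps.setBigDecimal(1, deltaById.get(id));
                ps.setInt(2, id);
                ps.executeUpdate();
            }
            con.commit();
        } catch (SQLException e) {
            con.rollback();
            throw e;
        }
    }
}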

In my experience, a deadlock in PostgreSQL most often happens in "real life" when you "cross" updates in two long-running transactions (explanation follows). And it is quite hard to even simulate a deadlock on PG. Here is an example - http://postgresql.freeideas.cz/simulate-deadlock-postgresql/
So a classical deadlock means:
You start transaction 1 and update row 1. Your procedure continues, but transaction 1 is still open and not committed.
Then you start transaction 2 and update row 2. The procedure continues, but transaction 2 is still open and not committed.
Now you try to update row 2 from transaction 1, which causes this operation to wait.
Then you try to update row 1 from transaction 2 - at this moment PG reports a deadlock and aborts that transaction.
So I recommend committing transactions as soon as possible.
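Here is a minimal JDBC sketch of that crossed pattern (connection URL, table and ids are illustrative); run it against a test table with two rows and one of the two threads will get "deadlock detected":

import java.sql.*;
import java.util.concurrent.CountDownLatch;

public class DeadlockDemo {
    public static void main(String[] args) {
        String url = "jdbc:postgresql://localhost/test?user=test&password=test"; // illustrative
        CountDownLatch bothLocked = new CountDownLatch(2);
        new Thread(() -> crossUpdate(url, 1, 2, bothLocked)).start();
        new Thread(() -> crossUpdate(url, 2, 1, bothLocked)).start();
    }

    // Each thread locks "its" row first, waits for the other thread to do the
    // same, then tries to update the other row - the classical crossed update.
    static void crossUpdate(String url, int firstId, int secondId, CountDownLatch bothLocked) {
        try (Connection con = DriverManager.getConnection(url)) {
            con.setAutoCommit(false);
            try (PreparedStatement ps = con.prepareStatement(
                    "UPDATE TableA SET column1 = column1 + 1 WHERE id = ?")) {
                ps.setInt(1, firstId);
                ps.executeUpdate();          // row firstId is now locked by this transaction
                bothLocked.countDown();
                bothLocked.await();          // make sure both rows are locked before crossing
                ps.setInt(1, secondId);
                ps.executeUpdate();          // waits for the other transaction's lock
            }
            con.commit();
        } catch (Exception e) {
            System.out.println(e.getMessage()); // one of the two threads reports the deadlock
        }
    }
}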

Related

How to copy managed database?

AFAIK there is no REST API providing this functionality directly, so I am using a restore for this (there are other ways, but those don't guarantee transactional consistency and are more complicated) via a Create request.
Since it is not possible to turn off short-term backups (retention has to be at least 1 day), this should be reliable. I am using the current time for the 'properties.restorePointInTime' property in the request. This works fine for most databases, but one DB returns this error (from the async operation request):
"error": {
"code": "BackupSetNotFound",
"message": "No backups were found to restore the database to the point in time 6/14/2021 8:20:00 PM (UTC). Please contact support to restore the database."
}
I know I am not out of range, because if the restore time is before 'earliestRestorePoint' (which can be found in a GET request on the managed database) or in the future, I get a 'PitrPointInTimeInvalid' error instead. Nevertheless, I found some information that I shouldn't use the current time but rather the current time minus 6 minutes at most. This matches the Azure Portal (where the copy fails with the same error, by the way), which doesn't allow a time newer than the current time minus 6 minutes. After a few tries, I found out that roughly the current time minus 40 minutes starts to work fine. But 40 minutes is a lot, and I didn't find any way to determine what time will work before I try it and wait for the result of the async operation.
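For reference, this is roughly the create-with-restore request I am sending (a sketch only: subscription, resource group, instance and database names are placeholders, the api-version and location are assumptions, and token handling is simplified):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RestoreCopy {
    public static void main(String[] args) throws Exception {
        String accessToken = System.getenv("AZURE_TOKEN");              // obtained separately
        String subscription = "00000000-0000-0000-0000-000000000000";   // placeholder
        String resourceGroup = "my-rg";                                  // placeholder
        String instance = "my-managed-instance";                         // placeholder
        String sourceDbId = "/subscriptions/" + subscription + "/resourceGroups/" + resourceGroup
                + "/providers/Microsoft.Sql/managedInstances/" + instance + "/databases/sourceDb";
        String url = "https://management.azure.com/subscriptions/" + subscription
                + "/resourceGroups/" + resourceGroup
                + "/providers/Microsoft.Sql/managedInstances/" + instance
                + "/databases/targetDbCopy?api-version=2021-02-01-preview"; // api-version is an assumption

        String body = "{ \"location\": \"westeurope\", \"properties\": {"
                + " \"createMode\": \"PointInTimeRestore\","
                + " \"sourceDatabaseId\": \"" + sourceDbId + "\","
                + " \"restorePointInTime\": \"2021-06-14T20:00:00Z\" } }";

        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Authorization", "Bearer " + accessToken)
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body()); // then poll the async operation
    }
}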
My question is: Is there a way to find what is the latest time possible for restore?
Or is there a better way to do ‘copy’ of managed database which guarantees transactional consistency and is reasonably quick?
EDIT:
The issue I was describing was reported to MS. It was occurring when:
there is a custom time zone format, e.g. UTC + 1 hour, and
backups are skipped for the source database at the desired point in time because the database is inactive (no active transactions).
This should be fixed as of now (25th of August 2021), and I was not able to reproduce it with the current time minus 10 minutes. Also, I was told there should be a new API which will allow making a copy without using PITR (no sooner than Q1/22).
To answer your first question "Is there a way to find what is the latest time possible for restore?"
Yes. Via SQL. The only way to find this out is by using extended event (XEvent) sessions to monitor backup activity.
The process to start logging the backup_restore_progress_trace extended event and to report on it is described here: https://learn.microsoft.com/en-us/azure/azure-sql/managed-instance/backup-activity-monitor
Including the SQL here in case the link goes stale.
This is for storing in the ring buffer (max last 1000 records):
CREATE EVENT SESSION [Verbose backup trace] ON SERVER
ADD EVENT sqlserver.backup_restore_progress_trace(
WHERE (
[operation_type]=(0) AND (
[trace_message] like '%100 percent%' OR
[trace_message] like '%BACKUP DATABASE%' OR [trace_message] like '%BACKUP LOG%'))
)
ADD TARGET package0.ring_buffer
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,
TRACK_CAUSALITY=OFF,STARTUP_STATE=ON)
ALTER EVENT SESSION [Verbose backup trace] ON SERVER
STATE = start;
Then to see output of all backup events:
WITH
a AS (SELECT xed = CAST(xet.target_data AS xml)
FROM sys.dm_xe_session_targets AS xet
JOIN sys.dm_xe_sessions AS xe
ON (xe.address = xet.event_session_address)
WHERE xe.name = 'Verbose backup trace'),
b AS(SELECT
d.n.value('(@timestamp)[1]', 'datetime2') AS [timestamp],
ISNULL(db.name, d.n.value('(data[@name="database_name"]/value)[1]', 'varchar(200)')) AS database_name,
d.n.value('(data[@name="trace_message"]/value)[1]', 'varchar(4000)') AS trace_message
FROM a
CROSS APPLY xed.nodes('/RingBufferTarget/event') d(n)
LEFT JOIN master.sys.databases db
ON db.physical_database_name = d.n.value('(data[@name="database_name"]/value)[1]', 'varchar(200)'))
SELECT * FROM b
NOTE: This tip came to me via Microsoft support when I had the same issue of point-in-time restores failing at what seemed like random. They do not give any SLA for log backups. I found that on a busy database the log backups seemed to happen every 5-10 minutes, but on a quiet database only hourly. Recovery of a database this way can be slow depending on the number of transaction logs and the amount of activity to replay, etc. (https://learn.microsoft.com/en-us/azure/azure-sql/database/recovery-using-backups)
To answer your second question: "Or is there a better way to do ‘copy’ of managed database which guarantees transactional consistency and is reasonably quick?"
I'd have to agree with Thomas - if you're after guaranteed transactional consistency and speed you need to look at creating a failover group https://learn.microsoft.com/en-us/azure/azure-sql/database/auto-failover-group-overview?tabs=azure-powershell#best-practices-for-sql-managed-instance and https://learn.microsoft.com/en-us/azure/azure-sql/managed-instance/failover-group-add-instance-tutorial?tabs=azure-portal
A failover group for a managed instance will have a primary server and a failover server, with the same user databases on each kept in sync.
But yes, whether this suits your needs depends on the question Thomas asked: what is the purpose of the copy?

How to use synchronous messages on rabbit queue?

I have a Node.js function that needs to be executed for each order on my application. In this function my app gets an order number from an Oracle database, processes the order and then adds 1 to that number in the database (this needs to be the last thing in the function, because an order can fail and then the number will not be used).
If all received orders at time T are processed at the same time (asynchronously), then the same order number will be used for multiple orders, and I don't want that.
So I used RabbitMQ to try to remedy this situation, since it is a queue. The processes seem to finish in the order they should, but a second process does NOT wait for the first one to finish (ack) before it begins, so in the end I'm having the same problem of using the same order number multiple times.
Is there any way I can configure my queue to process one message at a time? To only start processing message n+1 when message n has been acknowledged?
This would be a life saver to me!
If the problem is to avoid duplicate order numbers, then use an Oracle sequence, or use an identity column when you insert into a table to generate the order number:
CREATE TABLE mytab (
id NUMBER GENERATED BY DEFAULT ON NULL AS IDENTITY(START WITH 1),
data VARCHAR2(20));
INSERT INTO mytab (data) VALUES ('abc');
INSERT INTO mytab (data) VALUES ('def');
SELECT * FROM mytab;
This will give:
ID DATA
---------- --------------------
1 abc
2 def
If the problem is that you want orders to be processed sequentially, then don't pull an order from the queue until the previous one is finished. This will limit your throughput, so you need to understand your requirements and make some architectural decisions.
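In RabbitMQ terms, "one at a time" usually means a prefetch count of 1 combined with manual acknowledgements on a single consumer: the broker will not deliver message n+1 until message n has been acked. A minimal sketch with the RabbitMQ Java client (queue name and processing logic are illustrative; amqplib for Node.js exposes the same prefetch/ack options):

import com.rabbitmq.client.*;

public class SequentialConsumer {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");                       // illustrative broker address
        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel();

        channel.queueDeclare("orders", true, false, false, null);
        channel.basicQos(1);                                // at most one unacked message per consumer

        DeliverCallback onMessage = (consumerTag, delivery) -> {
            String body = new String(delivery.getBody(), "UTF-8");
            processOrder(body);                             // illustrative: fetch number, process, increment
            channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
        };                                                  // next message is delivered only after this ack
        channel.basicConsume("orders", false, onMessage, consumerTag -> { });
    }

    private static void processOrder(String order) {
        System.out.println("processing " + order);
    }
}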
Overall, it sounds like Oracle Advanced Queuing would be a good fit. See the node-oracledb documentation on AQ.

Oracle Concurrency Handling

I have a multi-threaded environment (multiple nodes) where each thread calls a select query that has n eligible rows, but I have configured rownum < 6, followed by an update for the same 5 selected records. These are done in a single transaction through a Java program.
My requirement:
While these 5 records are being processed by one thread in one process, all the other threads need to pick 5 records each from the remaining n-5 records and process them.
If 2 threads are initiated at the same time, how can I handle the situation so that the two threads pick different data sets?
Oracle version: 11.2.0.2.0
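One common way to get that behaviour (a sketch, assuming a work_items table with an id and a status column; adapt the names to your schema) is SELECT ... FOR UPDATE SKIP LOCKED, so each transaction locks the rows it picked and other sessions simply skip those rows. A minimal JDBC sketch:

import java.sql.*;
import java.util.ArrayList;
import java.util.List;

public class BatchPicker {
    // Picks up to 5 unprocessed rows, skipping rows already locked by other
    // threads/nodes, and marks them as in progress in the same transaction.
    // Note: ROWNUM is applied before locking, so a batch may come back with
    // fewer than 5 rows when other sessions hold locks; the sets stay disjoint.
    static List<Long> pickBatch(Connection con) throws SQLException {
        List<Long> ids = new ArrayList<>();
        con.setAutoCommit(false);
        try (PreparedStatement sel = con.prepareStatement(
                 "SELECT id FROM work_items WHERE status = 'NEW' " +
                 "AND rownum <= 5 FOR UPDATE SKIP LOCKED");
             ResultSet rs = sel.executeQuery()) {
            while (rs.next()) {
                ids.add(rs.getLong(1));
            }
        }
        try (PreparedStatement upd = con.prepareStatement(
                 "UPDATE work_items SET status = 'IN_PROGRESS' WHERE id = ?")) {
            for (Long id : ids) {
                upd.setLong(1, id);
                upd.executeUpdate();
            }
        }
        con.commit();   // or commit after the batch has been fully processed
        return ids;
    }
}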

Cassandra Counters Double Counting

I am new to Cassandra and am having an issue with counters double counting sometimes. I am trying to keep track of daily event counts for certain events. Here is my table structure:
create table pipes.pipe_event_counts (
count counter,
pipe_id text,
event_type text,
date text,
PRIMARY KEY ((pipe_id, event_type, date))
);
The driver I am using is the Datastax Java driver, and I am compiling and binding parameters to the following prepared statement:
incrementPipeEventCountStatement = CassandraClient.getInstance().getSession().prepare(
QueryBuilder.update("pipes", PIPE_EVENT_COUNT_TABLE_NAME).with(incr("count")).
where(eq("pipe_id", "?")).and(eq("date", "?")).and(eq("event_type", "?")).
getQueryString()
);
incrementPipeEventCountStatement.bind(
event.getAttrubution(Meta.PIPE_ID), dateString, event.getType().toString()
)
The problem is very weird. Sometimes when I process a single event, the counter increments properly by 1. However, the majority of the time, it double increments. I've been looking at my code for some time now and can't find any issues that would cause a second increment.
Is my implementation of counters in Cassandra correct for my use case? I think it is, but I could be losing my mind. I'm hoping someone can help me confirm so I can focus in the right area to find my problem.
Important edit: This is the query I'm running to check the count after the event:
select count from pipes.pipe_event_counts where pipe_id = 'homepage' and event_type = 'click' and date = '2015-04-07';
The thing with counters is that they are not idempotent operations, so when you retry (and don't know whether your original write was successful) you may end up over-counting.
Alternatively, you can choose never to retry, and risk under-counting instead.
As Chris shared, there are some issues with the counter implementation pre-2.1 that make the over-counting issue much more severe. There are also performance issues associated with counters, so you want to make sure you look into these in detail before you push a counter deployment to production.
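If you do stay on counters, one thing you can do is make sure the driver never silently retries an increment, so a timed-out write surfaces as an error you handle deliberately instead of a possible double count. A minimal sketch, assuming the DataStax Java driver 2.x (contact point and bound values are illustrative, taken from the question's schema):

import com.datastax.driver.core.*;
import com.datastax.driver.core.exceptions.WriteTimeoutException;
import com.datastax.driver.core.policies.FallthroughRetryPolicy;

public class CounterClient {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")                      // illustrative contact point
                .withRetryPolicy(FallthroughRetryPolicy.INSTANCE)  // never retry automatically
                .build();
        Session session = cluster.connect();

        PreparedStatement increment = session.prepare(
                "UPDATE pipes.pipe_event_counts SET count = count + 1 " +
                "WHERE pipe_id = ? AND event_type = ? AND date = ?");
        try {
            session.execute(increment.bind("homepage", "click", "2015-04-07"));
        } catch (WriteTimeoutException e) {
            // The increment may or may not have been applied; retrying here could
            // double count, so log it and reconcile later instead.
            System.err.println("counter increment timed out: " + e.getMessage());
        }
        cluster.close();
    }
}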
Here are the related Jiras to help you make informed decisions:
Counters ++ (major improvement - fixed 2.1) - https://issues.apache.org/jira/browse/CASSANDRA-6504
Memory / GC issues from large counter workloads, Counter Column (major improvement - fixed 2.1) - https://issues.apache.org/jira/browse/CASSANDRA-6405
Counters into separate cells (final solution - eta 3.1) - https://issues.apache.org/jira/browse/CASSANDRA-6506

C# 2 instances of same app reading from same SQL table, each row processed once

I'm writing a Windows Service in C# on .NET 4.0 to fulfill the following functionality:
At a set time every night the app connects to SQL Server, opens a User table and for each record retrieves the user's IP address, does a WCF call to the user's PC to determine if it's available for transacting and inserts a record into a State table (with y/n and the error if there is one).
Once all users have been processed, the app then reads each record in the State table where IsPcAvailable = true, retrieves a list of reports for that user from another table and, for each report, fetches the report from the enterprise doc repository, calls the user's PC via WCF and pushes the report onto their hard drive, then updates the State table with its success.
The above scenario is easy enough to code if it runs single-threaded on 1 app server; but due to redundancy & performance there will be at least 2 app servers doing exactly the same thing at the same time.
So how do I make sure that each user is processed only once, first in the User table and then in the State table (same problem), as fetching the reports and pushing them out to PCs all across the country is a lengthy process? Optimally the app should also be multithreaded, so for example, having 10 threads running on 2 servers processing all the users.
I would prefer a C# solution as I'm not a database guru :) The closest I've found to my problem is:
SQL Server Process Queue Race Condition - it uses SQL code
and multithreading problems with Entity Framework; I'm probably going to have to go one layer down and use ADO.NET?
I would recommend using the techniques at http://rusanu.com/2010/03/26/using-tables-as-queues/ - that's an excellent read for you at this time.
Here is some SQL for a FIFO queue:
create procedure usp_dequeueFifo
as
set nocount on;
with cte as (
select top(1) Payload
from FifoQueue with (rowlock, readpast)
order by Id)
delete from cte
output deleted.Payload;
go
And one for a heap (order does not matter)
create procedure usp_dequeueHeap
as
set nocount on;
delete top(1) from HeapQueue with (rowlock, readpast)
output deleted.payload;
go
This reads so beautifully it's almost poetry.
You could simply have each application server poll a common table (work_queue). You can use a common table expression to read/update the row so the servers don't step on each other.
;WITH t AS
(
SELECT TOP 1 *
FROM work_queue WITH (ROWLOCK, READPAST)
WHERE NextRun <= GETDATE()
AND IsPcAvailable = 1
)
UPDATE t
SET
IsProcessing = 1,
MachineProcessing = 'TheServer'
OUTPUT INSERTED.*
Now you could have a producer thread in your application checking for unprocessed records periodically. Once that thread finishes its work, it pushes the items into a ConcurrentQueue and consumer threads can process the work as it becomes available. You can set the number of consumer threads yourself to the optimum level. Once a consumer thread is done, it simply sets IsProcessing = 0 to show that the PC was updated.
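The producer/consumer part is language-agnostic; here is a minimal sketch of the same shape in Java, with a BlockingQueue standing in for .NET's ConcurrentQueue and hypothetical helper methods (claimBatchFromDatabase, pushReports, markDone) in place of your data access and WCF calls:

import java.util.List;
import java.util.concurrent.*;

public class WorkQueueProcessor {
    // Illustrative work item claimed from the work_queue table.
    record WorkItem(int userId, String ipAddress) {}

    public static void main(String[] args) {
        BlockingQueue<WorkItem> queue = new LinkedBlockingQueue<>();
        ExecutorService consumers = Executors.newFixedThreadPool(10);   // tune to taste

        // Producer: periodically claims unprocessed rows (e.g. with the CTE above)
        // and hands them to the consumers.
        ScheduledExecutorService producer = Executors.newSingleThreadScheduledExecutor();
        producer.scheduleWithFixedDelay(() -> {
            for (WorkItem item : claimBatchFromDatabase()) {             // hypothetical DB call
                queue.offer(item);
            }
        }, 0, 30, TimeUnit.SECONDS);

        // Consumers: take items as they become available and push the reports.
        for (int i = 0; i < 10; i++) {
            consumers.submit(() -> {
                while (!Thread.currentThread().isInterrupted()) {
                    try {
                        WorkItem item = queue.take();
                        pushReports(item);                               // hypothetical report push
                        markDone(item);                                  // sets IsProcessing = 0
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
        }
    }

    static List<WorkItem> claimBatchFromDatabase() { return List.of(); }
    static void pushReports(WorkItem item) {}
    static void markDone(WorkItem item) {}
}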

Resources