Oracle Concurrency Handling - multithreading

I have a multithreaded environment (multiple nodes) where each thread runs a SELECT query for which n rows are eligible, but I have limited it with ROWNUM < 6, followed by an UPDATE of the same 5 selected records. Both statements run in a single transaction from a Java program.
My requirement:
While these 5 records are being processed by one thread, every other thread should pick its own 5 records from the remaining n-5 and process those.
If 2 threads are initiated at the same time, how can I handle the situation so that the two threads pick different data sets?
Oracle version: 11.2.0.2.0
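
A common way to make concurrent pickers see disjoint row sets on Oracle 11g is SELECT ... FOR UPDATE SKIP LOCKED: each transaction locks the rows it selects, and other sessions skip locked rows instead of blocking on them. A minimal JDBC sketch of that idea, assuming a hypothetical WORK_ITEMS table with ID and STATUS columns:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;

public class BatchClaimer {
    // Claims up to 5 unprocessed rows in one transaction. SKIP LOCKED makes
    // concurrent sessions pass over rows already locked by another thread,
    // so two threads never claim the same records.
    static List<Long> claimBatch(Connection conn) throws Exception {
        conn.setAutoCommit(false);
        List<Long> ids = new ArrayList<>();
        // Note: ROWNUM is applied before the lock attempt, so under heavy
        // contention a thread may claim fewer than 5 rows on a given pass.
        try (PreparedStatement sel = conn.prepareStatement(
                "SELECT id FROM work_items WHERE status = 'NEW'"
                + " AND ROWNUM < 6 FOR UPDATE SKIP LOCKED");
             ResultSet rs = sel.executeQuery()) {
            while (rs.next()) {
                ids.add(rs.getLong(1));
            }
        }
        try (PreparedStatement upd = conn.prepareStatement(
                "UPDATE work_items SET status = 'IN_PROGRESS' WHERE id = ?")) {
            for (long id : ids) {
                upd.setLong(1, id);
                upd.addBatch();
            }
            upd.executeBatch();
        }
        conn.commit(); // releases the locks; the rows are now marked as claimed
        return ids;
    }
}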

Related

Spring Batch Retry using Remote Partitioning and Thread Pool Executor trying to process same record twice

I have a spring-batch service using remote partitioning and a chunk size of 10. If an item in a chunk fails, there is a retry limit of 3. I am using a thread pool executor.
I observed that during retry, a ForkJoinPool worker thread is spawned and processes the records in the failed chunk. In addition, the thread on which the chunk originally failed also tries to process the records in the chunk simultaneously. Because a record has already been processed by the ForkJoinPool thread, the processor returns null, and so the filter count in the batch step execution table gets updated instead of the write count. How do I prevent multiple processing of the same record in the failed chunk when it has already been executed successfully by another thread?
This means the same item can be processed by two different threads, which should not be the case. Each partition should be a distinct data set, so that each worker processes distinct items. Beyond that, your item processor should be idempotent in a fault-tolerant step; see https://docs.spring.io/spring-batch/docs/current/reference/html/processor.html#faultTolerant
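What an idempotent processor can look like in practice, as a minimal sketch: the Order record and the ProcessedStore lookup below are hypothetical placeholders for your own item type and a durable record (for example, a table keyed on the item id) of what has already been written.

import org.springframework.batch.item.ItemProcessor;

// Hypothetical item type standing in for your own.
record Order(long id) { }

// Hypothetical durable lookup of already-processed item ids.
interface ProcessedStore {
    boolean alreadyProcessed(long id);
}

public class IdempotentOrderProcessor implements ItemProcessor<Order, Order> {

    private final ProcessedStore processedStore;

    public IdempotentOrderProcessor(ProcessedStore processedStore) {
        this.processedStore = processedStore;
    }

    @Override
    public Order process(Order item) throws Exception {
        // On a retry, an item that has already been written successfully is
        // filtered out (return null) instead of being processed again; that
        // filtering is what increments filterCount rather than writeCount.
        if (processedStore.alreadyProcessed(item.id())) {
            return null;
        }
        // ... the actual transformation / side effects go here ...
        return item;
    }
}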

How to use synchronous messages on rabbit queue?

I have a node.js function that needs to be executed for each order on my application. In this function my app gets an order number from an Oracle database, processes the order, and then increments that number on the database (this has to be the last thing in the function, because the order can fail, in which case the number will not be used).
If all orders received at time T are processed at the same time (asynchronously), then the same order number will be used for multiple orders, and I don't want that.
So I used RabbitMQ to try to remedy this situation, since it is a queue. The processes seem to finish in the order they should, but a second process does NOT wait for the first one to finish (ack) before beginning, so in the end I have the same problem of the same order number being used multiple times.
Is there any way I can configure my queue to process one message at a time? To only start message n+1 when message n has been acknowledged?
This would be a life saver to me!
If the problem is to avoid duplicate order numbers, then use an Oracle sequence, or use an identity column when you insert into a table to generate the order number:
CREATE TABLE mytab (
    id   NUMBER GENERATED BY DEFAULT ON NULL AS IDENTITY (START WITH 1),
    data VARCHAR2(20)
);

INSERT INTO mytab (data) VALUES ('abc');
INSERT INTO mytab (data) VALUES ('def');

SELECT * FROM mytab;
This will give:
        ID DATA
---------- --------------------
         1 abc
         2 def
If the problem is that you want orders to be processed sequentially, then don't pull an order from the queue until the previous one is finished. This will limit your throughput, so you need to understand your requirements and make some architectural decisions.
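With RabbitMQ, "don't pull the next order until the previous one is finished" is a prefetch count of 1 combined with manual acknowledgements; in Node's amqplib that is channel.prefetch(1) before channel.consume. A minimal sketch of the pattern, shown here with the RabbitMQ Java client and a hypothetical queue named orders:

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

import java.nio.charset.StandardCharsets;

public class SequentialConsumer {
    public static void main(String[] args) throws Exception {
        Connection conn = new ConnectionFactory().newConnection();
        Channel channel = conn.createChannel();
        channel.queueDeclare("orders", true, false, false, null);

        // prefetch = 1: the broker will not deliver message n+1 until
        // message n has been acked, so orders are handled strictly one
        // at a time even though delivery is asynchronous.
        channel.basicQos(1);

        channel.basicConsume("orders", false /* manual ack */, (tag, delivery) -> {
            String order = new String(delivery.getBody(), StandardCharsets.UTF_8);
            // ... fetch the order number, process the order, increment it ...
            channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
        }, tag -> { });
    }
}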
Overall, it sounds like Oracle Advanced Queuing (AQ) would be a good fit. See the node-oracledb documentation on AQ.

Is it beneficial to use DataAdapter.Update to update millions of records in a DataBaseTable?

I am supposed to update 2-3 fields in a database table with tens of millions of records. I am doing the operation in a .NET application in batches of 100K (recursively), updating the table with regular ADO.NET code and executing stored procs. Done this way, the process is estimated to take 30 hours (probably because of I/O and server round trips), and I have to do it in just 4.
Would DataAdapter.Update be any faster? Any suggestions on improving speed are greatly appreciated.
Well, I also had the same problem, but we solved it using threading.
First, make a list of DataSets of 1,000 or 10,000 records each.
Then create a pooled task for each one by calling:
Task tsk = Task.Factory.StartNew(() => Function_Name(listObject));
Define Function_Name to perform the batch update using the DataAdapter.
You will see a huge difference: the pooled tasks fire concurrently.
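The same idea in outline, sketched here with Java's ExecutorService and JDBC statement batching standing in for the .NET task pool and DataAdapter (the big_table name and status column are hypothetical):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelBatchUpdate {
    // Splits the work into chunks and updates each chunk on its own pooled
    // thread, sending one batched statement per chunk instead of one
    // statement per row.
    static void updateAll(List<long[]> idChunks, String jdbcUrl) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (long[] chunk : idChunks) {
            pool.submit(() -> updateChunk(chunk, jdbcUrl));
        }
        pool.shutdown();
    }

    static void updateChunk(long[] ids, String jdbcUrl) {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             PreparedStatement ps = conn.prepareStatement(
                     "UPDATE big_table SET status = 'DONE' WHERE id = ?")) {
            conn.setAutoCommit(false);
            for (long id : ids) {
                ps.setLong(1, id);
                ps.addBatch();   // accumulate; no round trip per row
            }
            ps.executeBatch();   // single round trip for the whole chunk
            conn.commit();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}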

C# 2 instances of same app reading from same SQL table, each row processed once

I'm writing a Windows Service in C# on .NET 4.0 to fulfil the following functionality:
At a set time every night the app connects to SQL Server, opens a User table and for each record retrieves the user's IP address, does a WCF call to the user's PC to determine if it's available for transacting and inserts a record into a State table (with y/n and the error if there is one).
Once all users have been processed, the app then reads each record in the State table where IsPcAvailable = true, retrieves a list of reports for that user from another table, and for each report fetches it from the enterprise document repository, calls the user's PC via WCF to push the report onto their hard drive, then updates the State table with its success.
The above scenario is easy enough to code if it runs single-threaded on 1 app server; but for redundancy and performance there will be at least 2 app servers doing exactly the same thing at the same time.
So how do I make sure that each user is processed only once, first in the User table and then in the State table (same problem), given that fetching the reports and pushing them out to PCs all across the country is a lengthy process? Optimally, the app should also be multithreaded, so, for example, 10 threads running on 2 servers process all the users.
I would prefer a C# solution, as I'm not a database guru :) The closest I've found to my problem is:
SQL Server Process Queue Race Condition - it uses SQL code
and multithreading problems with the entity framework; am I probably going to have to go one layer down and use ADO.NET?
I would recommend using the techniques at http://rusanu.com/2010/03/26/using-tables-as-queues/ That's an excellent read for you at this time.
Here is some SQL for a FIFO:
create procedure usp_dequeueFifo
as
    set nocount on;
    with cte as (
        select top(1) Payload
        from FifoQueue with (rowlock, readpast)
        order by Id)
    delete from cte
        output deleted.Payload;
go
And one for a heap (order does not matter)
create procedure usp_dequeueHeap
as
    set nocount on;
    delete top(1) from HeapQueue with (rowlock, readpast)
        output deleted.payload;
go
This reads so beautifully it's almost poetry.
You could simply have each application server poll a common table (work_queue). You can use a common table expression to read/update the row so the servers don't step on each other.
;WITH t AS
(
    SELECT TOP 1 *
    FROM work_queue WITH (ROWLOCK, READPAST)
    WHERE NextRun <= GETDATE()
      AND IsPcAvailable = 1
)
UPDATE t
SET IsProcessing = 1,
    MachineProcessing = 'TheServer'
OUTPUT INSERTED.*
Now you could have a producer thread in your application checking periodically for unprocessed records. Once that thread finishes its work, it pushes the item into a ConcurrentQueue, and consumer threads can process the work as it becomes available. You can set the number of consumer threads yourself to the optimal level. Once a consumer thread is done, it simply sets IsProcessing = 0 to show that the PC was updated.
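The same producer/consumer shape in outline (a sketch using Java's BlockingQueue as a stand-in for .NET's ConcurrentQueue; the WorkItem type and the data-access stubs are hypothetical):

import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueWorkers {
    // Hypothetical work item: one claimed row from work_queue.
    record WorkItem(long userId) { }

    private final BlockingQueue<WorkItem> queue = new LinkedBlockingQueue<>();

    // Producer thread: periodically claims unprocessed rows (e.g. via the
    // UPDATE ... OUTPUT statement above) and hands them to the consumers.
    void producerLoop() throws InterruptedException {
        while (true) {
            for (WorkItem item : claimRows()) {
                queue.put(item);
            }
            Thread.sleep(5_000);   // polling interval
        }
    }

    // Consumer threads: start as many as tuning suggests; take() blocks
    // until the producer has queued more work.
    void consumerLoop() throws InterruptedException {
        while (true) {
            WorkItem item = queue.take();
            process(item);          // push the reports to the PC, etc.
            markDone(item);         // e.g. set IsProcessing = 0
        }
    }

    // Stubs standing in for the real data access and WCF calls.
    List<WorkItem> claimRows() { return List.of(); }
    void process(WorkItem item) { }
    void markDone(WorkItem item) { }
}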

Why does multi-reader performance in SQLite degrade when using multiple connections instead of one serialized connection?

When running a SELECT that returns more than 1 row concurrently on a number of reader threads, I see that using multiple connections (one connection per reader thread) performs worse than using a single connection shared by all of them (in SERIALIZED mode).
I have a database table on a RAM disk with 6 INTEGER columns; column[0] is the PRIMARY KEY.
It has 10000 entries; the other 5 columns are filled with random numbers using rand().
I have 50 reader threads, each doing 50 iterations of:
SELECT * FROM tbl ORDER BY
Test cases are:
1. one connection (SERIALIZED mode)
2. 50 connections (still in SERIALIZED mode)
3. 50 connections (MULTI THREADED mode)
What seems strange is that the performance of (2) and (3) is poorer than that of (1).
(2) took about 3 times as long as (1), and (3) took about 4 times as long.
The average iteration time was significantly higher, and so was the longest iteration time.
The only suspicious clue I have is that in (1) the total system-call time (strace -c) was negligible, but in (2) and (3) nanosleep took 99% of the time.
Does anyone have a clue as to why that is?
