Multiple inserts in a single transaction using multithreading

I am using Spring Boot JPA and have marked a method as transactional.
The transactional method performs multiple inserts that are independent of each other.
I tried to improve performance by running them on multiple threads inside the transactional method, but the transactional behavior was lost when I did this.
Can you please suggest a way to achieve this?
    @Transactional
    public void methodA() {
        repo1.save(A);
        repo2.save(B);
        repo3.save(C);
        repo4.save(D);
    }
I want to run these save operations in different threads but in one single transaction.
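For illustration, here is a minimal sketch of the kind of attempt described above; the executor setup is an assumption, not code from the question. Spring binds the transaction context to the calling thread via a ThreadLocal, so saves submitted to worker threads run outside the surrounding transaction:

    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import org.springframework.transaction.annotation.Transactional;

    @Transactional
    public void methodA() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        // Spring keeps the transaction in a ThreadLocal of the calling thread,
        // so each of these saves executes OUTSIDE the transaction.
        List<Callable<Object>> saves = List.of(
                () -> repo1.save(A),   // repos and entities as in the question
                () -> repo2.save(B),
                () -> repo3.save(C),
                () -> repo4.save(D));
        pool.invokeAll(saves);         // transactional behavior is lost here
        pool.shutdown();
    }

Since a JPA EntityManager and its underlying JDBC connection are not thread-safe, standard Spring/JPA offers no supported way to share one transaction across threads; the saves have to stay on the transaction's own thread.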

Related

Synchronicity of Azure stored procedures [duplicate]

Can documentDb stored procedures run in parallel and update the same object? Will documentDb process them sequentially?
Consider the following scenario.
I have an app and I have 10000 coins to give away to my users when they complete a task. And I have the following object
    {
        remainingPoints: 10000
    }
I have a stored procedure that subtracts 10 points from this object and adds them to the users' points.
Now let's say 10 users complete the task at the same time and I call the stored procedure 10 times simultaneously. Will DocDb execute the calls sequentially, or will I have to execute the stored procedures sequentially myself?
I had similar questions when I first started using DocumentDB and got good answers here and in email from the DocumentDB product managers. Quoting:
Stored procedures ... get an isolated snapshot of the database for transactional support. The snapshot reflects the current state of the world (no stale data) at the time the sproc begins execution (strongly consistent).
Caveat: since stored procedures operate on a snapshot, you can still get a stale read in a sproc if a new write comes in from the outside world during execution.
Also, stored procedures will ALWAYS read their owns writes.
Sprocs are DocumentDB's mechanism for multi-document transactions. Sproc writes are committed when a sproc successfully completes execution. If an exception is thrown, all work done in the sproc gets rolled back.
So if two sprocs are running concurrently, they won't see each other's writes.
If both sprocs happen to write to the same document (replace), then the second one will fail due to an etag mismatch when it attempts to commit its writes.
From that, I went forward with my design, making sure to use ETags in my writes as @Julian suggests. I also automatically retry each sproc execution up to 3 times to handle the case where it fails due to parallel operations, among other reasons. In practice, I've never exceeded the 3 retries (except in cases where my sproc had a bug), and I rarely even get a single retry.
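As a rough illustration of that retry idea (the helper below is hypothetical, not code from the answer): wrap each sproc execution and retry a bounded number of times when it fails, e.g. on an etag mismatch caused by a concurrent writer.

    import java.util.concurrent.Callable;

    // Hypothetical helper: run a sproc call, retrying up to maxRetries times.
    // With DocumentDB, the typical failure being retried is an etag mismatch
    // caused by a concurrent write to the same document.
    static <T> T executeWithRetry(Callable<T> sprocCall, int maxRetries) throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return sprocCall.call();
            } catch (Exception e) {
                last = e; // e.g. a concurrency/etag failure; try again
            }
        }
        throw last; // still failing after all attempts
    }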
From the behavior I observe, I assume that it sends each new sproc execution to a different replica until it runs out of replicas, and then queues them for sequential execution; so it's a hybrid of parallel and serial execution.
One other tip that I learned through experimentation is that you are better off doing pure read operations (no writes and no significant aggregation) client-side rather than in a sproc when you are on a heavily loaded system. I assume the advantage is because DocumentDB can satisfy different reads from different replicas in parallel. I have modularized my sproc code using the expandScript functionality of documentdb-utils to make sure that I use the exact same code for write validation, intra-document consistency, and derived fields both client-side and server-side, which is possible using node.js. Even if you are mostly .NET, you may want to use expandScripts to build your sprocs in a modular DRY way. You'll still need to run node.js in your build process to pre-process your sprocs or use Edge.NET (node running inside of .NET) to do so on the fly.
It will depend on the consistency level you have chosen for your collection. But the idea is that DocumentDb handles concurrency using etags and executes a stored procedure on a snapshot of a document version, committing the result only if the execution succeeds.
See: https://azure.microsoft.com/en-us/documentation/articles/documentdb-faq/#develop
This thread may help too: Atomically increment an integer in a document in Azure DocumentDB

How to achieve locking across multiple table updates in Cassandra so as to attain isolation and avoid the dirty read problem

I am using Cassandra as the NoSQL DB in my solution and have a data model with 2 tables: a parent table and a child table.
Here is the scenario:
Client A updates a parent table record as well as the child table records.
At the same time, client B issues a select request (which hits both the parent and the child table).
Client B receives the latest record from the parent table but gets an older record from the child table.
I can use a logged batch operation to achieve atomicity when updating both tables, but I am not sure how to isolate or lock the read request from client B so as to avoid the dirty read problem.
I have also tried evaluating lightweight transactions, but they don't seem to work in this case.
I am wondering if I could use some middleware application to implement the locking functionality, since there seems to be nothing available in Cassandra out of the box.
Please help me understand how to achieve read/write synchronization in this regard.
As you mentioned, Cassandra provides only atomicity when you choose to batch. It does provide isolation, though, when you make a single-partition batch, which unfortunately is not your case.
To answer your question: if you really need transactions, I would think about the problem and possible solutions once again. Either you should eliminate the need for locking, or you should change the technology stack.
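To make the distinction concrete, here is a sketch using the DataStax Java driver (3.x); keyspace, table, and column names are made up. A logged batch spanning two tables (and therefore, usually, two partitions) is atomic but not isolated, so a concurrent reader can observe one update without the other:

    import com.datastax.driver.core.BatchStatement;
    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.SimpleStatement;

    Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
    Session session = cluster.connect("my_keyspace");

    // Atomic (all-or-nothing), but isolated only when every statement
    // targets the same partition of the same table, which is not the case here.
    BatchStatement batch = new BatchStatement(BatchStatement.Type.LOGGED);
    batch.add(new SimpleStatement(
            "UPDATE parent SET status = ? WHERE parent_id = ?", "done", 42));
    batch.add(new SimpleStatement(
            "UPDATE child SET status = ? WHERE child_id = ?", "done", 7));
    session.execute(batch);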

How can I use parallel transactions in Neo4j?

I am currently working on an application using Neo4j as an embedded database.
I am wondering how to make sure that separate threads use separate transactions. Normally, I would assign database operations to a transaction, but the code examples I found don't show how to make sure that write operations use separate transactions:
    try (Transaction tx = graphDb.beginTx()) {
        Node node = graphDb.createNode();
        tx.success();
    }
As graphDb is meant to be used as a thread-safe singleton, I really don't see how that is supposed to work (e.g., for several users creating shopping lists in separate transactions).
I would be grateful for pointing out where I misunderstand the concept of transactions in Neo4j.
Best regards and many thanks in advance,
Oliver
The code you posted will run in separate transactions if executed by multiple threads, one transaction per thread.
The way this is achieved (and it's quite a common pattern) is by storing the transaction state in a ThreadLocal (read the Javadoc and things will become clear).
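A minimal sketch of what that means in practice (the thread handling here is added for illustration, not from the original answer): each worker opens, uses, and commits its own transaction against the shared GraphDatabaseService.

    // Each Runnable gets its own transaction; Neo4j keeps the transaction
    // state in a ThreadLocal, so the two threads never share one.
    Runnable work = () -> {
        try (Transaction tx = graphDb.beginTx()) {
            Node node = graphDb.createNode(); // visible only to this tx until commit
            tx.success();                     // mark for commit when tx closes
        }
    };
    new Thread(work).start();
    new Thread(work).start(); // second, fully independent transaction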
Neo4j Transaction Management
In order to fully maintain data integrity and ensure good transactional behavior, Neo4j supports the ACID properties:
Atomicity: if any part of a transaction fails, the database state is left unchanged.
Consistency: any transaction will leave the database in a consistent state.
Isolation: during a transaction, modified data cannot be accessed by other operations.
Durability: the DBMS can always recover the results of a committed transaction.
Specifically:
All database operations that access the graph, indexes, or the schema must be performed in a transaction.
Here are some useful links for understanding Neo4j transactions:
http://neo4j.com/docs/stable/rest-api-transactional.html
http://neo4j.com/docs/stable/query-transactions.html
http://comments.gmane.org/gmane.comp.db.neo4j.user/20442

Is Linq to Sql DeleteOnSubmit threadsafe?

Assuming the decision to delete is independent and thread-safe, is it thread-safe to call DeleteOnSubmit in parallel?
All entities will be added to the delete set in parallel; the change will then be submitted afterwards.
My testing hasn't shown a problem, but that doesn't inherently mean it's safe...
No.
LINQ to SQL itself is not thread-safe; nor are any of its methods.

Multithread UnitOfWork in Nhibernate

We are working on a C# Windows service using NHibernate that is supposed to process a batch of records.
The service has to process about 6,000-odd records, and at present this takes about 3 hours. A lot of DB hits are incurred, and while we are trying to minimize these, we are also exploring multithreading options to improve performance.
We are using the UnitOfWork pattern to access the NHibernate session.
This is roughly what the service looks like:

    public class BatchService
    {
        public void DoWork()
        {
            StartUnitOfWork();
            foreach (var record in recordsToBeProcessed)
            {
                Process(record); // performs lots of DB operations
            }
            StopUnitOfWork();
        }
    }
We were thinking of using the Task Parallel Library to process these records in batches (using the Parallel.ForEach() method).
From what I have read about NHibernate so far, we should give each thread a separate NHibernate session.
My question is how to supply this, considering that the UnitOfWork pattern only allows one session to be available.
Should I be looking at wrapping a UnitOfWork around the processing of a single record?
Any help much appreciated.
Thanks
The best way is to start a new UnitOfWork for each thread and use a thread-static contextual session (NHibernate.Context.ThreadStaticSessionContext). You must be aware of detached entities.
The easiest way is to wrap each processing of a record in its own unit of work and then run each UOW on its own thread. You need to make sure that each UOW and session is started, used, and completed on a single thread.
To gain performance, you could split the batch of records into smaller batches, wrap the processing of these smaller batches into UOWs, and execute them on separate threads.
Depending on your workload, using a second-level cache (memcached/membase) might dramatically improve your performance (e.g., if you need to read some records from the DB for each processing step).
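As a rough sketch of that batch-splitting, session-per-thread idea, shown here with Java Hibernate (NHibernate's Java sibling) since the pattern is identical; the Record type and process method are placeholders:

    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import org.hibernate.Session;
    import org.hibernate.SessionFactory;
    import org.hibernate.Transaction;

    // One unit of work per sub-batch, opened, used, and closed entirely on a
    // single worker thread. The SessionFactory itself is thread-safe; the
    // Sessions it opens are not, so each stays confined to its thread.
    void processBatches(SessionFactory factory, List<List<Record>> subBatches) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (List<Record> batch : subBatches) {
            pool.submit(() -> {
                try (Session session = factory.openSession()) {
                    Transaction tx = session.beginTransaction();
                    for (Record record : batch) {
                        process(session, record); // the per-record DB work
                    }
                    tx.commit();
                }
            });
        }
        pool.shutdown();
    }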
