Is DbSet.Add (and other non-DB operations) thread safe? - multithreading

I get it that all the operations involving the DB access should not be called in parallel. Creating DbContext is cheap, use the new one, all that.
But what about the local operations, like DbSet.Add(...), or DbSet.Local.<...>? They happen almost instantly, so the chances of race conditions are extremely low, but still. What are the underlying containers in DbSet? Do they support thread-safe operations?

Based on a GitHub issue and this answer, DbSet is not considered thread-safe. The responses from the GitHub issue indicate that anything in EFCore that is not a singleton should be considered non-thread-safe.

Related

DbContext & DbcontextPool in ef-core

I read a lot of documents and articles about DBcontext in Efcore and its lifetime, however, I have some questions.
According to https://learn.microsoft.com/en-us/ef/core/dbcontext-configuration/, the recommended lifetime of a DbContext, and the default lifetime used by AddDbContext, is scoped, but two sentences in that document seem to contradict each other.
"DbContext is not thread-safe. Do not share contexts between
threads. Make sure to await all async calls before continuing to use
the context instance."
On the other hand, the same document also says:
"Dbcontext is safe from concurrent access issues in most
ASP.NET Core applications because there is only one thread executing
each client request at a given time, and because each request gets
a separate dependency injection scope (and therefore a separate
DbContext instance)."
I am confused about whether registering DbContext as a scoped service is thread-safe or not.
What are the problems of registering DbContext as a singleton service, in detail?
In addition, I read some docs that prohibit registering a singleton DbContext; however, AddDbContextPool appears to register DbContext as a singleton.
So I have some questions about DbContextPool:
What are the impacts of using DbContextPool instead of DbContext?
When should we use it, and what should be considered when we use the context pool?
Is DbContextPool thread-safe?
Does it cause memory issues because it stores a number of DbSet instances throughout the application's lifetime?
Would change tracking or any other part of EF fail when using the DbContext pool?
One DbContext per web request... why?
.NET Entity Framework and transactions
I understand why you think the language in the Microsoft documents is confusing. I'll unravel it for you:
"DbContext is not thread-safe." This statement means that it's not safe to access a DbContext from multiple threads in parallel. The Stack Overflow answers you already referenced explain this.
"Do not share contexts between threads." This statement is confusing, because asynchronous (async/await) operations tend to run across multiple threads, although never in parallel. A simpler statement would be: "do not share contexts between web requests," because a single web request typically runs a single unit of work and, although it might run its code asynchronously, it typically doesn't run its code in parallel.
"Dbcontext is safe from concurrent access issues in most ASP.NET Core applications": This text is a bit misleading, because it might make the reader believe that DbContext instances are thread-safe, but they aren't. What the writers mean to say here is that, with the default configuration (i.e. using AddDbContext<T>()), ASP.NET Core ensures that each request gets its own DbContext instance, making it, therefore, "safe from concurrent access" by default.
1 I was confused about whether registering DBcontext as a scoped service is thread-safe or not?
DbContext instances are not thread-safe by themselves, which is why you should register them as Scoped. Each request then gets its own instance, so no instance is accessed from multiple requests, and that is what makes their use thread-safe.
2 What are the problems of registering DBcontext as a singleton service in detail?
This is already described in detail in this answer, which you already referenced. I think that answer goes into a lot of detail, which I won't repeat here.
In addition, I read some docs that prohibit registering singleton DbContext, however, AddDbContextPool makes to register singleton DBcontext. so there are some questions about the Dbcontextpool.
The DbContext pooling feature is very different from registering DbContext as singleton, because:
The pooling mechanism ensures that parallel requests get their own DbContext instance.
Therefore, multiple DbContext instances exist with pooling, while only a single instance for the whole application exists when using the Singleton lifestyle.
Using the singleton lifestyle, therefore, ensures that one single instance is reused, which causes the myriad of problems laid out (again) here.
The pooling mechanism ensures that, when a DI scope ends, the DbContext is 'cleaned' and brought back to the pool, so it can be reused by a new request.
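The difference from a singleton can be illustrated with a small, language-neutral sketch. This is not EF Core's implementation: `FakeContext`, `ContextPool`, `rent` and `give_back` are hypothetical names invented for the illustration, and the fixed pool size is an assumption. It only shows the two properties described above: concurrent callers get distinct instances, and an instance is cleaned before it is reused.

```python
import queue


class FakeContext:
    """Hypothetical stand-in for a DbContext: it only tracks added entities."""

    def __init__(self):
        self.tracked = []

    def add(self, entity):
        self.tracked.append(entity)

    def reset(self):
        # Clear per-request state so nothing leaks into the next request.
        self.tracked.clear()


class ContextPool:
    """Hands each caller its own instance and recycles it afterwards."""

    def __init__(self, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(FakeContext())

    def rent(self):
        # Simplification: blocks when every instance is in use.
        return self._idle.get()

    def give_back(self, ctx):
        ctx.reset()
        self._idle.put(ctx)


pool = ContextPool(size=2)
first, second = pool.rent(), pool.rent()   # two "parallel requests"
first.add("order-1")
distinct = first is not second             # each request has its own instance
pool.give_back(first)
pool.give_back(second)
reused = pool.rent()                       # the cleaned instance is handed out again
```

A singleton, by contrast, would hand the same `FakeContext` to both requests, and `tracked` would accumulate state from all of them.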
what are the impacts of using the DbContextPool instead of the DbContext?
More information about this is given in this document.
when we should use it and what should be considered when we use contextPool?
When your application requires the performance benefits that it brings. This is something you might want to benchmark before deciding to add it.
DbContextPool is thread-safe?
Yes, in the same way as registering a DbContext as Scoped is thread-safe; if you accidentally hold on to a DbContext instance inside an object that is reused across requests, this guarantee is broken. You have to take good care of Scoped objects to prevent them from becoming Captive Dependencies.
Has it memory issues because of storing a number of dbset instances throughout the application's lifetime?
The memory penalty will hardly ever be noticeable. The so-called first-level cache is cleared for every DbContext that is brought back to the pool after a request ends. This is to prevent the DbContext from becoming stale and to prevent memory issues.
change-tracking or any parts of the ef would be failed or not in the DB context pool?
No, they won't fail. For the most part, making your DbContext pooled is something that only requires infrastructural changes (changes to the application's startup path) and is otherwise transparent to the rest of your application. But again, make sure to read this to familiarize yourself with the consequences of using DbContext pooling.

Should diesel be run using a sync actor, actix_web::web::block or futures-cpupool?

Background
I am working on an actix-web application using diesel through r2d2 and am unsure of how to best make asynchronous queries. I have found three options that seem reasonable, but am unsure of which one is best.
Potential Solutions
Sync Actor
For one I could use the actix example, but it is quite complicated and requires a fair deal of boilerplate to build. I hope there exists a more reasonable solution.
Actix_web::web::block
As another option I could use the actix_web::web::block to wrap my query functions into a future, but I am unsure of the performance implications of this.
Is the query then running in the same Tokio system? From what I could find in the source, it creates a thread in the underlying actix-web threadpool. Is that a problem?
If I read the code right, r2d2 blocks its thread when acquiring a connection, which would block part of the core actix-web pool. Same with database queries. This would then block all of actix-web if I do more queries than I have threads in that pool? If so, big problem.
Futures-cpupool
Finally, the safe bet that may have some unneeded overhead is futures-cpupool. The main issue is that this means adding another crate to my project, and I don't like the idea of multiple cpu-pools floating around in my application needlessly.
Since both r2d2 and diesel will block, there is a surprising number of tricky things here.
Most importantly, do not share this cpupool with anything not using the same r2d2 pool (as all threads created may just block waiting for an r2d2 connection, locking down the whole pool when work exists).
Secondly (and a bit more obviously), the number of r2d2 connections and the number of threads in the pool should match, since whichever is larger wastes resources: surplus connections sit unused, and surplus threads are constantly blocked. (One extra thread might be worthwhile, so that connection handover is done by the OS scheduler rather than the cpupool scheduler, which may be quicker.)
Finally, mind what database you are using and the performance you have there. Running a single connection r2d2 and a single thread in the pool might be best in a write heavy sqlite application (though I would recommend a proper database for such).
Old answers
Old solutions that may work
https://www.reddit.com/r/rust/comments/axy0hp/patterns_to_scale_actixweb_and_diesel/
In essence, recommends Futures-cpupool.
What is the best approach to encapsulate blocking I/O in future-rs?
Recommends Futures-cpupool for general cases.
Old solutions that don't work
https://www.reddit.com/r/rust/comments/9fe1ye/noob_here_can_we_talk_about_async_and_databases/
A really nice fix for an old actix-web version. From what I can find, requests no longer have a cpu-pool in them.
I am going with futures-cpupool. It is the best solution due to the blocking nature of my interactions.
Using actix_web::web::block is decent enough, but will use a shared thread-pool in actix (and due to the blocking calls I use this can block the entire thread pool and interfere with other tasks given to actix_web).
It is better to use futures-cpupool to create a separate threadpool per database just for database interactions. This way you group all the tasks that need to wait for each other (when there are more tasks than connections) into one pool, preventing them from blocking any other tasks that don't need a connection and potentially limiting the number of threads to the number of connections (so that the task will only be scheduled when it won't be blocked).
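The idea of a dedicated pool sized to the connection count can be sketched independently of the Rust crates involved. The following is a Python analogy, not actix-web, r2d2 or futures-cpupool code: `POOL_SIZE`, `connections`, `db_executor` and `blocking_query` are hypothetical names, and the "connection" is just an integer taken from a blocking queue.

```python
import asyncio
import queue
import threading
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 2  # assumption: as many worker threads as connections

# Hypothetical connection pool: a blocking queue of connection ids.
connections = queue.Queue()
for conn_id in range(POOL_SIZE):
    connections.put(conn_id)

# Dedicated executor used only for database work.
db_executor = ThreadPoolExecutor(max_workers=POOL_SIZE, thread_name_prefix="db")

stats = {"in_flight": 0, "max_in_flight": 0}
stats_lock = threading.Lock()


def blocking_query(n):
    """Simulates a blocking query: take a connection, work, return it."""
    conn = connections.get()  # never waits long: threads == connections
    try:
        with stats_lock:
            stats["in_flight"] += 1
            stats["max_in_flight"] = max(stats["max_in_flight"], stats["in_flight"])
        time.sleep(0.01)  # stand-in for the query itself
        with stats_lock:
            stats["in_flight"] -= 1
        return n * 2
    finally:
        connections.put(conn)


async def main():
    loop = asyncio.get_running_loop()
    # The event loop stays free; only the dedicated db threads ever block.
    jobs = [loop.run_in_executor(db_executor, blocking_query, n) for n in range(6)]
    return await asyncio.gather(*jobs)


results = asyncio.run(main())
db_executor.shutdown()
```

Because the executor never runs more tasks than there are connections, a task is only scheduled when a connection is available, and work that does not need the database is never queued behind it.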
In the case where you only want to use one database connection (or very few) the sync actor is a pretty good option. It will act like a futures-cpupool with one thread, ensuring that all tasks are run one at a time, except that it will use one of actix-web's underlying threads rather than a separate one (therefore, only good for very few connections). I find the boilerplate too big to be worth it, though.

spring security strategy MODE_INHERITABLETHREADLOCAL. Why?

I understand how and what happens when we use MODE_THREADLOCAL and MODE_INHERITABLETHREADLOCAL in Spring Security Strategy. What I don't understand is, why would someone use MODE_THREADLOCAL over MODE_INHERITABLETHREADLOCAL.
Is there a memory impact with using one over the other? If so, is it significant?
What is a typical business/functional use case for using MODE_INHERITABLETHREADLOCAL?
Is there any performance difference between using one over the other?
The difference in memory impact between the two is negligible.
In some environments, it is common to spin up new Threads to do background tasks. Sometimes developers do not want the Thread that is created to contain a SecurityContext automatically. In these instances, MODE_THREADLOCAL is preferable. If you spin up a task on behalf of the current user, then it may be desirable to propagate the SecurityContext. In this instance, MODE_INHERITABLETHREADLOCAL would be preferable.
The performance difference between the two strategies is negligible.
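The distinction between the two behaviours can be illustrated with a short Python analogy. This is not Spring Security code: `local_ctx`, `user_var` and both task functions are hypothetical names, and the explicit context copy only mimics what an inheritable thread-local does automatically.

```python
import contextvars
import threading

# Per-thread storage: plays the role of the MODE_THREADLOCAL behaviour.
local_ctx = threading.local()
local_ctx.user = "alice"

# A context variable that we propagate explicitly to mimic inheritance.
user_var = contextvars.ContextVar("user", default=None)
user_var.set("alice")

seen = {}


def background_task():
    # Values in threading.local never carry over to a new thread.
    seen["thread_local"] = getattr(local_ctx, "user", None)


def task_on_behalf_of_user():
    seen["propagated"] = user_var.get()


t1 = threading.Thread(target=background_task)
# Copy the parent's context and run the task inside that copy.
parent_ctx = contextvars.copy_context()
t2 = threading.Thread(target=parent_ctx.run, args=(task_on_behalf_of_user,))
for t in (t1, t2):
    t.start()
for t in (t1, t2):
    t.join()
```

The first thread starts without any user, which is what you want for unrelated background work; the second sees the caller's user because the context was handed over deliberately.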

Dependency between message queuing messages

Here is my scenario:
I have two servers with a multi-threaded message queuing consumer on each (two consumers total).
I have many message types (CreateParent, CreateChild, etc.)
I am stuck with bad legacy code (creating a child partially creates a parent. I know it is bad... but I cannot change that.)
Message ordering cannot be assumed (message queuing principle!)
RabbitMQ is my message queuing broker.
My problem:
When two threads are running simultaneously (one executing a CreateParent, the other executing a CreateChild), they generate conflicts because both threads try to create the Parent in the database (remember the legacy code!)
My initial solution:
Inside the consumer, I created an "entity locking" concept. So when the thread processes a CreateChild message for example, it locks the Child and the Parent (legacy code!!) so the CreateParent message processing can wait. I used basic .net Monitor and list of Ids to implement this concept. It works well.
My initial solution limitation:
My "entity locking" concept works well on a single consumer in a single process on a single server. But it will not work across multiple servers running multiple consumers.
I am thinking of using a shared database to "store" my entity locking concept, so each process (and thread) could access the database to verify which entities are locked.
My question (finally!):
All this is becoming very complex, and it increases the risk of bugs and code maintenance problems. I really don't like it!
Has anyone already faced this kind of problem? Are there acceptable workarounds for it?
Does anyone have an idea for a clean solution for my scenario?
Thanks!
Finally, simple solutions are always the better ones!
Instead of using all the complexity of my "entity locking" concept, I finally settled on pre-validating all the required data and entity states before executing the request.
More precisely, instead of letting the CreateChild process crash when it encounters already existing data created by CreateParent, I fully validate that everything is okay in the databases BEFORE executing the CreateChild message.
The drawback of this solution is that the implementation of CreateChild must be aware of the specific data that CreateParent will produce and verify its presence before starting the execution. But seriously, this is far better than locking everything across systems!
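A minimal sketch of that pre-validation idea, assuming a very simple schema: the table layout, `parent_exists` and both handler functions are hypothetical, and an in-memory SQLite database stands in for the real one. The sketch runs single-threaded and does not address the window between the check and the insert.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
db.execute("CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER)")


def parent_exists(parent_id):
    row = db.execute("SELECT 1 FROM parent WHERE id = ?", (parent_id,)).fetchone()
    return row is not None


def handle_create_parent(parent_id):
    # Pre-validate: a CreateChild may already have created this parent.
    if not parent_exists(parent_id):
        db.execute("INSERT INTO parent (id) VALUES (?)", (parent_id,))


def handle_create_child(child_id, parent_id):
    # Pre-validate: only create the parent if CreateParent has not run yet.
    if not parent_exists(parent_id):
        db.execute("INSERT INTO parent (id) VALUES (?)", (parent_id,))
    db.execute(
        "INSERT INTO child (id, parent_id) VALUES (?, ?)", (child_id, parent_id)
    )


# Messages may arrive in either order.
handle_create_child(10, 1)
handle_create_parent(1)
handle_create_parent(2)
handle_create_child(20, 2)

parent_count = db.execute("SELECT COUNT(*) FROM parent").fetchone()[0]
child_count = db.execute("SELECT COUNT(*) FROM child").fetchone()[0]
```

Whichever message arrives first, each parent ends up in the table exactly once, and neither handler fails on data the other one already wrote.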

Working with Instance count in worker role

When using multiple instances in a worker role, won't there be thread synchronization issues? My concern is that two instances might try to pick up the same record and process it. How can I solve this issue?
Thanks
Not threading issues, but concurrency issues. Yes, there will be issues.
However, these issues are not different from normal concurrency issues that you might have with even a single web server receiving simultaneous requests.
The most common way to deal with concurrency issues is through the use of Optimistic Concurrency.
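Optimistic concurrency is usually implemented with a version (or timestamp) column that is checked at update time. The following is a minimal sketch, assuming a hypothetical `job` table with a `version` column; an in-memory SQLite database stands in for the real store, and `try_claim` is an invented helper.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE job (id INTEGER PRIMARY KEY, owner TEXT, version INTEGER)")
db.execute("INSERT INTO job VALUES (1, NULL, 0)")


def try_claim(job_id, worker, seen_version):
    """Claims the job only if nobody changed it since seen_version was read."""
    cur = db.execute(
        "UPDATE job SET owner = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (worker, job_id, seen_version),
    )
    return cur.rowcount == 1


# Both instances read the record at version 0, then race to claim it.
version_a = db.execute("SELECT version FROM job WHERE id = 1").fetchone()[0]
version_b = db.execute("SELECT version FROM job WHERE id = 1").fetchone()[0]

a_won = try_claim(1, "instance-a", version_a)
b_won = try_claim(1, "instance-b", version_b)  # stale version: no row is updated
owner = db.execute("SELECT owner FROM job WHERE id = 1").fetchone()[0]
```

The instance whose update affects no rows knows that another instance got there first, and it can skip the record or re-read it instead of processing it a second time.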
A common solution within the Windows Azure Platform for allocating work out to multiple worker processes is the use of Azure Storage Queues. This helps minimize the risk of two threads or even two roles working on a single item concurrently. However, there is a bit of additional work that is required to make this fully functional and ensure that the queue behavior is properly accounted for.
I wouldn't recommend using multiple single-threaded roles in order to avoid threading. It would be more expensive and, as Mark has pointed out, you will end up facing almost the same problems.
