Optimistic concurrency despite lock - multithreading

I have multiple threads running a batch job. When each thread finishes it calls this method of mine:
private static readonly Object lockVar = new Object();
public void UserIsDone(int batchId, int userId)
{
//Get the batch user
var batchUser = context.ScheduledUsersBatchUsers.SingleOrDefault(x => x.User.Id == userId && x.Batch.Id == batchId);
if (batchUser != null)
{
lock (lockVar)
{
context.ScheduledUsersBatchUsers.Remove(batchUser);
context.SaveChanges();
//Try to get the batch with the assumption it has no users left. If we do get the batch back, it means there are no users left.
var dbBatch = context.ScheduledUsersBatches.SingleOrDefault(x => x.Id == batchId && !x.Users.Any());
//So this must have been the last user, the batch is empty, so we fetch it and remove it.
if (dbBatch != null)
{
context.ScheduledUsersBatches.Remove(dbBatch);
context.SaveChanges();
}
}
}
}
What this method does is very simple, it looks up the "BatchUser" to remove him from the queue, which it does. That part works swell.
However, after removing the user I want to check if that was the last user in the whole batch. But since this is multithreaded a race condition can happen.
So I put the removing of the batch user within a lock, after I remove the user, I check if the batch has no more batch users.
But here is my problem... even tho I have a lock, and the query to get the "dbBatch" clearly requires it to have no users to return the object... even so, I sometimes get it back with users like so:
When I do get that, I also get the following error on SaveChanges()
However, at other times I get the dbBatch object back correctly with no children, like so:
And when I do, it all works great, no exceptions.
With debugger I can catch the error by setting a breakpoint on the lock statement (see screenshot one). Then all threads get to the lock (while one goes in). Then I always get the error.
If I only have a breakpoint inside the if-statement it's more random.
With the lock in place, I don't see how this happens.
Update
I Ninject my context, and this is my ninject code
kernel.Bind<MyContext>()
.To<MyContext>()
.InRequestScope()
.WithConstructorArgument("connectionStringOrName", "MyConnection");
kernel.Bind<DbContext>().ToMethod(context => kernel.Get<MyContext>()).InRequestScope();
Update 2
I also tried this solution https://msdn.microsoft.com/en-us/data/jj592904.aspx
But strangely I don't get a DbUpdateConcurrencyException but rather I get a DbUpdateException that has an InnerException that is OptimisticConcurrencyException.
But neither DbUpdateException or OptimisticConcurrencyException contains a Entries property so I can't do ex.Entries.Single().Reload();
I'm also adding the exception in text form here
Here in text also, The outer exception of type DbUpdateException: {"An error occurred while saving entities that do not expose foreign key properties for their relationships. The EntityEntries property will return null because a single entity cannot be identified as the source of the exception. Handling of exceptions while saving can be made easier by exposing foreign key properties in your entity types. See the InnerException for details."}
The InnerException of type OptimisticConcurrencyException: {"Store update, insert, or delete statement affected an unexpected number of rows (0). Entities may have been modified or deleted since entities were loaded. See http://go.microsoft.com/fwlink/?LinkId=472540 for information on understanding and handling optimistic concurrency exceptions."}

Related

How to avoid changing property values in an NSBatchInsertRequest?

I have a simple Core Data entity Story that occasionally I update with the latest data from a network call. This network call sometimes updates many, many stories instances, so I run an NSBatchInsertRequest, shown below. (The other reason I'm using a batch insert is that many stories might need to be added to the persistent store.)
The problem is a user can have already marked a Story as a favorite. When they do that, I set story.isFavorite = true on the main thread and save viewContext.
However, when the batch insert occurs it overwrites story.isFavorite, setting it back to false, even though I'm using NSMergeByPropertyObjectTrumpMergePolicy on both the batch insert and view contexts. I am not touching story.isFavorite in the batch insert handler either so I don't expect that property to be overwritten.
I thought the benefit of a batch insert with this merge policy was to avoid first fetching + then manually updating changed properties + finally saving. What is the right way to avoid changing property values in an NSBatchInsertRequest?
Story
#objc(Story)
public class Story: NSManagedObject {
#NSManaged public var title: String?
#NSManaged public var storyURL: URL?
#NSManaged public var updatedTime: Date?
#NSManaged public var isFavorite: Bool // <- the problem property
}
Batch insert
container.viewContext.mergePolicy = NSMergeByPropertyObjectTrumpMergePolicy
container.viewContext.automaticallyMergesChangesFromParent = false
let context = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
context.parent = container.viewContext
context.mergePolicy = NSMergeByPropertyObjectTrumpMergePolicy
context.perform {
let batchInsert = NSBatchInsertRequest(entity: Story.entity(), managedObjectHandler: { managedObject in
let story = managedObject as! Story
let storyResponse = downloadedStories[I]
// Update story with latest response data BUT don't modify story.isFavorite.
story.title = storyResponse.title
story.storyURL = storyResponse.storyURL
story.updatedTime = storyResponse.updatedTime
// ...
})
let result = try context.execute(batchInsert) as? NSBatchInsertResult
if let insertedIDs = result?.result as? [NSManagedObjectID] {
// Merge changes into parent context. Skip save() because not needed for batch insert.
NSManagedObjectContext.mergeChanges(fromRemoteContextSave: [NSInsertedObjectsKey: insertedIDs], into: [container.viewContext])
}
}
Edit
The Story entity does have a unique value constraint using attribute storyURL.
Update after Michael Tsai's answer
By making the Story entity attribute isFavorite a non-Optional Boolean without a default value (it was marked as Optional before, though I'm not sure it makes a difference here) and keeping the Use Scalar Type box checked, I can confirm that existing objects in the store will not be modified (at all) with this configuration of the batch insert context.
context.persistentStoreCoordinator = container.persistentStoreCoordinator
// HOWEVER, observe that regardless of the merge policy below,
// setting `context.parent = container.viewContext` will also
// overwrite the store data!
context.mergePolicy = NSMergeByPropertyStoreTrumpMergePolicy
// NSMergeByPropertyObjectTrumpMergePolicy ignores objects in the store
// (which have the same unique constraint value, here equal `storyURL`)
// and overwrites all properties.
// To confirm that the batch insert operation does not modify
// existing Story instances (at all), first delete all instances where
// where isFavorite == false. Then load the all story data again and
// execute the NSBatchInsertRequest with this change to managedObjectHandler:
story.title = storyResponse.title + " (modified)"
You will see the missing stories get inserted back, this time with their titles having a suffix " (modified)"; but previously favorited stories
do not get modified (basically, with this setup, the batch insert won't re-insert objects).
So the isFavorite property does not get overwritten BUT neither do any properties that should be changed (because they received a new title, for example).
Therefore, if you don't want your objects to get updated, but you want completely new objects to be inserted, you can use this approach.
However, if you are expecting your objects to require updates here are some alternatives:
you may opt to run a separate update operation, maybe an NSBatchUpdateRequest after you run your batch insert in this way,
or after the batch insert, you can update certain properties in a simple loop in a (possibly background/child) context without a batch operation, which could be fine if there isn't tons of data;
lastly, you might be able to first batch insert new data to a temporary store before somehow manually merging your choice of properties with the new store, then delete the temporary store.
A simpler approach: you could fetch the all properties you want to keep unchanged before you execute the batch insert (storing them in an dictionary keyed by your object's uniqueness constraint value), and then during the batch insert set the property again.
For this approach, you will want to use a different merge policy such as NSMergeByPropertyObjectTrumpMergePolicy so that the updated object gets re-inserted into the store (make sure to fetch all properties that you don't want to lose in advance of the batch insert)
random idea: How to Save Data When Using One ManagedObjectContext and PersistentStoreCoordinator with Two Stores
I don't think it is actually possible to do a partial update with a batch insert request. It's hard to know for sure because I don't think any of this is documented except in WWDC sessions. When I first watched the 2019 session, I was excited because the presenter said:
Attributes that are optional or configured with default values can be omitted from the dictionary as well.
In the case of updating an object with unique constraint, the existing values will not be changed.
I took this to mean that:
You can omit values for new objects, and you'll get the defaults or NULL. That makes sense.
If there's an existing object and you omit a value, that value will not the changed. So you can purposely omit values to do a partial update, i.e. update other values while leaving your isFavorite alone.
But, after writing code to test this and looking at the output from com.apple.CoreData.SQLDebug, what actually seems to happen with NSMergeByPropertyObjectTrumpMergePolicy is:
If you omit a value that's required you get a validation error.
If you omit a value that's optional, it updates the row to NULL. For a Bool property in Swift, this will become false.
If you omit a value with a default value, it updates the row to the default.
This is a shame because it seems like partial updates could be implemented by having the ON CONFLICT clause only specify DO UPDATE SET for the attributes that you actually set. But (as of macOS 11) Core Data seems to always generate SQL to set all of the columns.
In summary, with batch inserts, NSMergeByPropertyObjectTrumpMergePolicy does not actually merge by property based on what's changed (like with a regular Core Data save). Rather, it either inserts a new row (if the object is absent) or overwrites all the columns but preserves the objectID (if the object was present).
NSMergeByPropertyStoreTrumpMergePolicy also doesn't merge by property. It just means to leave the stored object alone if it's already present.
Update (2021-06-24): I heard from DTS that Apple considers the current (iOS 14/macOS 11) behavior described above a bug, and that it should let you batch insert without changing omitted properties. The Radar number is 79747419.

IsConcurrencyToken(true) does not actually check concurrency tokens

I have a property configured like this:
public byte[] Timestamp { get; set; }
And then in my DbContext, i use the Fluent API like so:
modelBuilder.Entity<MyClass>()
.Property(x => x.Timestamp)
.IsRowVersion()
.IsConcurrencyToken(true);
So naturally i went ahead and wrote a unit test, ensuring that an entity with a wrong timestamp set would not get saved. I used Sqlite and some custom Sql to make RowVersion work in Unit Tests but to my surprise i never got an exception. Then, i tested it in our application, and i also did not get an exception when a wrong Timestamp was set on the entity.
var myInstance = await myDbContext.Instances
.Include(x => x...)
.Include(x => x...)
.SingleAsync(x => x.Id == id);
// set other values, add new entities to relationships, aso
myInstance.Timestamp = new byte[] { 1, 2, 3, 4 };
await myDbContext.SaveChangesAsync();
I am clearly missing something here. I thought configuring IsRowVersion would be enough to force EF Core to include a WHERE Timestamp = clause in the UPDATE but it seems like thats not the case. As you can see, i also tried calling IsConcurrencyToken (even with its default value of true, just to be sure) but to no avail.
Edit: I have worked "around" it now by including the Timestamp in my SingleAsync call, but this still leaves me unsure if its still possible to not get a concurrency exception, as the Timestamp set on my entity is apparently not checked at all when saving?
This is a known behavior of EF core, as documented here:
https://github.com/aspnet/EntityFrameworkCore/issues/18505
Manually changing the value of the token is considered to be a no-op, as the original value that was queried from the database is being used for a concurrency check. Manually changing the value does nothing, as it is effectively being ignored.
It is possible to manually set the token's value and have EF Core's optimistic concurrency check fail if the database has a newer version of the entity:
db.Entry(entity).Property(nameof(Entity.ConcurrencyToken)).OriginalValue = entity.ConcurrencyToken;
As per this comment in #user604613's answer

How to load document out of database instead of memory

Using Raven client and server #30155. I'm basically doing the following in a controller:
public ActionResult Update(string id, EditModel model)
{
var store = provider.StartTransaction(false);
var document = store.Load<T>(id);
model.UpdateEntity(document) // overwrite document property values with those of edit model.
document.Update(store); // tell document to update itself if it passes some conflict checking
}
Then in document.Update, I try do this:
var old = store.Load<T>(this.Id);
if (old.Date != this.Date)
{
// Resolve conflicts that occur by moving document period
}
store.Update(this);
Now, I run into the problem that old gets loaded out of memory instead of the database and already contains the updated values. Thus, it never goes into the conflict check.
I tried working around the problem by changing the Controller.Update method into:
public ActionResult Update(string id, EditModel model)
{
var store = provider.StartTransaction(false);
var document = store.Load<T>(id);
store.Dispose();
model.UpdateEntity(document) // overwrite document property values with those of edit model.
store = provider.StartTransaction(false);
document.Update(store); // tell document to update itself if it passes some conflict checking
}
This results in me getting a Raven.Client.Exceptions.NonUniqueObjectException with the text: Attempted to associate a different object with id
Now, the questions:
Why would Raven care if I try and associate a new object with the id as long as the new object carries the proper e-tag and type?
Is it possible to load a document in its database state (overriding default behavior to fetch document from memory if it exists there)?
What is a good solution to getting the document.Update() to work (preferably without having to pass the old object along)?
Why would Raven care if I try and associate a new object with the id as long as the new object carries the proper e-tag and type?
RavenDB leans on being able to serve the documents from memory (which is faster). By checking for persisting objects for the same id, hard to debug errors are prevented.
EDIT: See comment of Rayen below. If you enable concurrency checking / provide etag in the Store, you can bypass the error.
Is it possible to load a document in its database state (overriding default behavior to fetch document from memory if it exists there)?
Apparantly not.
What is a good solution to getting the document.Update() to work (preferably without having to pass the old object along)?
I went with refactoring the document.Update method to also have an optional parameter to receive the old date period, since #1 and #2 don't seem possible.
RavenDB supports optimistic concurrency out of the box. The only thing you need to do is to call it.
session.Advanced.UseOptimisticConcurrency = true;
See:
http://ravendb.net/docs/article-page/3.5/Csharp/client-api/session/configuration/how-to-enable-optimistic-concurrency

Can I use NSBatchDeleteRequest on entities with relationships that have delete rules?

I'm attempting to use NSBatchDeleteRequest to delete a pile of entities, many of these entities have delete cascade and/or nullify rules.
My first attempt to delete anything fails and the NSError I get back includes the string "Delete rule is not supported for batch deletes". I had thought it was fine to delete such things but i was responsible for making sure all the constraints are satisfied before I do a save.
Should I be able to batch delete these managed objects? (I want to keep the delete rules, other delete paths don't have an easy way to know what set of objects to delete) Do some kinds of batch deletes work in this case, but others not? (say predicates fail, but a list of object IDs work?)
Batch delete is problematic with relationships.
It goes directly to the database and deletes the records suspending all object graph rules, including the delete rules. You have correctly identified the requirement that you need to do all the constraint checking yourself again. (That by itself could be a deal-breaker.)
Even if you manage to delete the entities and all the necessary related entities correctly, you will still be left with lots of entries in the (opaque) join table Core Data creates in the background. There is no obvious safe way to delete the entries in the join tables and they have been reported to interfere with managing relationships in future operations.
IMO , the solution in this case is to still use the object graph rather than batch delete and optimize for performance. There are many good answers on SOF on how to do this, but most of it can be summarized with these points:
find the right batch size for saving (typically 500 entities for creation, about 2000 for deletion, but this could vary according to object size and relationship complexity - you have to experiment).
if you have memory constraints, use autoreleasepools.
use a background context to free the UI for interaction. I prefer to do the saving to the database in the background after updating the UI.
I just wrote a simple Department-Employee (one-to-many) demo project. The delete rule of Empolyee's department relationship is set to cascade.
When using batch deletes to delete a department with two employees, the number of deleted objects is only 1. So for the time being, batch deletes disregard delete rules.
You can try it for your self:
func deleteDepartment(named name: String) {
let fetch = NSFetchRequest<NSFetchRequestResult>(entityName: "Department")
fetch.predicate = NSPredicate(format: "name = %#", name)
let req = NSBatchDeleteRequest(fetchRequest: fetch)
req.resultType = .resultTypeCount
do {
let result = try self.persistentContainer.viewContext.execute(req)
as? NSBatchDeleteResult
print(result?.result as! Int) // number of objects deleted
} catch {
fatalError("Error!!!!")
}
}
If anyone would need this:
You can use two NSBatchDeleteRequest for parent and child entities.
let childFetchRequest: NSFetchRequest<NSFetchRequestResult> = NSFetchRequest(entityName: "ChildEntityName")
let childDeleteRequest = NSBatchDeleteRequest(fetchRequest: childFetchRequest)
do {
try persistenceService.context().execute(childDeleteRequest)
let parentFetchRequest: NSFetchRequest<NSFetchRequestResult> = NSFetchRequest(entityName: "ParentEntityName")
let parentDeleteRequest = NSBatchDeleteRequest(fetchRequest: parentFetchRequest)
do {
try persistenceService.context().execute(parentDeleteRequest)
persistenceService.saveContext()
/// handle success
} catch {
persistenceService.context().reset() // for example
/// handle error
}
}catch {
/// handle error
}

Azure table entity existence/synchronisation

I'm using an azure table query to retrieve all error entities assigned to a user.
Afther that I change a property of the entity to state that the entity is in processing mode.
After I have processed the entity I remove the entity from the table.
When I do parallel tests it can happen that during the query, an entity was already processed and deleted by another thread. So I get the error 404 ResourceNotFound when I want to Replace the entity.
Is there a way to test, if the entity was changed outside of the thread or if it still exists? Is it better to catch error 404 and ignore it or should I query for the entity again (seems all not right for me)?
TableQuery<ErrorObjectTableEntity> query = new TableQuery<ErrorObjectTableEntity>().Where(TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, user));
List<ErrorObjectTableEntity> queryResult = table.ExecuteQuery(query).OrderBy(x => x.action).ToList();
foreach (ErrorObjectTableEntity entity in queryResult)
{
entity.inProcess = true;
try
{
TableOperation updateOperation = TableOperation.Replace(entity);
table.Execute(updateOperation);
}
catch
{
//..some logging here
//catch error 404?
}
//do some action
try
{
TableOperation deleteOperation = TableOperation.Delete(entity);
table.Execute(deleteOperation);
}
catch{...}
}
There are a couple of issues here as far as best practice. Your code as written could simply ignore the exception assuming another worker removed it but this could end up masking other classes of errors. One solution would be to use Queues to insert messages per user query, and then have various workers retrieve a message and process the query for a specific user. This way if a node goes down the app would absorb the fault and continue on. Additionally, this would keep your workers from duplicating work which would optimize the entire application. Lastly, if you don't care about the state of the entity and the keys are predictable you can use the Merge semantic to simply update a given property of an Entity without replacing the entire thing.
You should just catch the 404 error. Although they're represented as exceptions in .NET, HTTP 4xx error codes are more informational than exceptional. (5xx error codes are exceptional.)
Even if you checked that the entity existed before doing the replace, you would still need to catch the NotFound error in case it had been deleted between the check and the replace call. So you might as well skip the check.

Resources