Using Core Data in an asynchronous operation

I have an iOS sync process broken down into a series of asynchronous NSOperation subclasses. These are broken down into those that do heavy processing and those that rely on networking. But most of them also do things with Core Data.
I'm not sure how to perform Core Data operations within the operation.
Here are a couple of trivial examples; my real code switches in and out of the database context several times. It also uses @synchronized(self) {}.
NSManagedObjectContext *context = [self newContextFromParent];
__block NSString *someValue;
[context performBlockAndWait:^{
    // fetch someValue from Core Data
}];
[self doMoreWorkWithValue:someValue];
[context performBlockAndWait:^{
    NSError *e = nil;
    if ([context hasChanges]) {
        [context save:&e];
    }
}];
This seems on the surface like a good approach, but depending on what I do in performBlockAndWait: there are potential deadlocks here.
Generally, I like to avoid performBlockAndWait: in my code and use performBlock: instead.
[context performBlock:^{
    NSString *someValue = @""; // fetch someValue from Core Data
    [backgroundQueue addOperationWithBlock:^{
        [self doMoreWorkWithValue:someValue withCompletion:^{
            [context performBlock:^{
                NSError *e = nil;
                if ([context hasChanges]) {
                    [context save:&e];
                }
            }];
        }];
    }];
}];
With this approach, though, I've moved my processing from the thread I was given to whatever thread backgroundQueue decides to run my process on, and I'm not sure what the better approach is.
If I capture [NSOperationQueue currentQueue] in the main operation and add to it instead, I've added my block to the end of the queue. What I really want is to resume.
What approach should I be using here?

Your first approach is the one I would use. You mentioned concern about deadlocks inside performBlockAndWait:, though. Are you concerned about calling into another method that might itself use performBlockAndWait:? If so, no worries; performBlockAndWait: is explicitly safe to use re-entrantly. That is: this code is safe (though obviously contrived):
[context performBlockAndWait:^{
[context performBlockAndWait:^{
// fetch someValue from Core Data
}];
}];
If the deadlock concern is not related to Core Data, then it seems like you'd be at just as much of a risk of deadlock inside doMoreWorkWithValue:, right?

Related

Non-ARC Project: NSMutableArray, NSString Memory Leak

Our team works on an existing application. The project is non-ARC (it does not use automatic reference counting). We have a doubt about releasing objects in the following code.
Code 1: Why is there no crash when I execute this code?
NSMutableArray *arraytest = [[NSMutableArray alloc] init];
for (int i = 0; i < 100; i++)
{
    NSString *str = [NSString stringWithFormat:@"string:%d", i];
    [arraytest addObject:str];
}
NSLog(@"arraytest before:%@", arraytest);
[arraytest release];
NSLog(@"arraytest after:%@", arraytest);
Similar code: with mutable copy
Code 2: After the changes the following code crashes at the last line.
NSMutableArray *arraytest = [[NSMutableArray alloc] init];
for (int i = 0; i < 100; i++)
{
    NSString *str = [NSString stringWithFormat:@"string:%d", i];
    [arraytest addObject:str];
}
NSLog(@"arraytest before:%@", arraytest);
NSMutableArray *copyarray = [arraytest mutableCopy];
[arraytest release];
NSLog(@"copyarray:%@", copyarray);
NSLog(@"arraytest after:%@", arraytest);
Why is there a memory leak in this line?
And why is there a memory leak in this line?
What is the correct way to write the above code without memory leaks? My colleagues say that autorelease should not be used in the above code.

Consider your first example:
NSMutableArray *arraytest = [[NSMutableArray alloc] init];
// `arraytest` populated ...
NSLog(@"arraytest before:%@", arraytest);
[arraytest release];
NSLog(@"arraytest after:%@", arraytest); // DANGER: referencing dangling pointer!!!
That last NSLog statement is exceedingly dangerous, because after having called [arraytest release] (reducing the array's retain count from +1 to 0), the object pointed to by arraytest will have been deallocated and arraytest is now a dangling pointer to that deallocated memory. You should never reference a pointer after it's been deallocated. Sometimes it may look like you can use it, but it's not safe, and your app can now unexpectedly crash. (If you used zombies, though, it would have safely warned you of your attempt to incorrectly use this dangling pointer.)
Consider your second example:
NSMutableArray *arraytest = [[NSMutableArray alloc] init];
// `arraytest` populated ...
NSLog(@"arraytest before:%@", arraytest);
NSMutableArray *copyarray = [arraytest mutableCopy];
[arraytest release];
NSLog(@"copyarray:%@", copyarray);
NSLog(@"arraytest after:%@", arraytest); // DANGER: referencing dangling pointer!!!
In this example, you still have that very dangerous NSLog of arraytest at the end, after you've released it, and your use of that dangling pointer could easily crash. So you'd want to get rid of that.
But you now have introduced a leak. While you've released the object that arraytest originally pointed to, you have not released the object that copyarray points to, the result of the mutableCopy. Thus, this new copyarray instance will leak. And thus, all those strings that were originally allocated when you created arraytest will now be referenced by the leaked copyarray and they'll leak, too.
If you added a [copyarray release] to the end of this routine, both the leaking array and the leaking strings would have been resolved.
Now, consider your third example, shown only in the final Instruments screen snapshot:
NSMutableArray *arraytest = [[NSMutableArray alloc] init];
for (int i = 0; i < 100; i++) {
    NSString *str = [NSString stringWithFormat:@"string:%d", i];
    [arraytest addObject:str];
    [str release]; // DANGER: released `str` whose ownership was never transferred to you!!!
}
NSLog(@"arraytest before:%@", arraytest);
NSMutableArray *copyarray = [arraytest mutableCopy];
[arraytest release];
NSLog(@"copyarray:%@", copyarray);
NSLog(@"arraytest after:%@", arraytest); // DANGER: referencing dangling pointer!!!
In this final example, we are compounding the problem by overreleasing the string that you just added to the array. So you created an autoreleased object (effectively, the retain count will be zero when the autorelease pool is drained), added it to the array (increasing its effective retain count to +1), and released str (reducing its retain count back to +0 upon draining of the pool).
The app is now in an unstable situation, because the array is now referencing objects that could be released when the autorelease pool is drained, ending up with an array of dangling pointers. Worse, if this array were ever properly released itself, all of those strings would be overreleased.
And, of course, you still have the leaking of the array, as discussed in the second example, above. But if you did properly release this copyarray, all of those strings would be overreleased.
Probably needless to say at this point, but the way to eliminate this leak is to simply release the copyarray:
NSMutableArray *arraytest = [[NSMutableArray alloc] init];
for (int i = 0; i < 100; i++) {
    NSString *str = [NSString stringWithFormat:@"string:%d", i];
    [arraytest addObject:str];
}
NSLog(@"arraytest before:%@", arraytest);
NSMutableArray *copyarray = [arraytest mutableCopy];
[arraytest release];
NSLog(@"copyarray:%@", copyarray);
[copyarray release];
This follows the Basic Memory Management Rules, namely that you are responsible for calling release on those objects that you own by virtue of having received them from a method whose name starts with alloc, new, copy, or mutableCopy.
A couple of closing observations:
If you're going to use manual reference counting, I'd suggest that you make frequent use of Xcode's static analyzer (shift+command+B, or "Analyze" on Xcode's "Product" menu). It's surprisingly good at identifying manual reference counting memory problems. Instruments is useful, but as shown by our third example above, you can easily be drawn to incorrect conclusions (e.g. "gee, I need to release all those strings"). The static analyzer would have pointed out some of these problems for you.
Bottom line, always make sure you have a clean bill of health from the static analyzer before proceeding further. There is no point in trying to reverse engineer what problems may be manifested in Instruments when the analyzer could have told you precisely what the issue was.
I would advise not drawing any conclusions from the fact that a particular dangling pointer didn't crash your app, but another did. It's just not predictable. If you turn on zombies (only temporarily, for development/testing purposes, not for production apps), that will bring your attention to any attempts to reference a previously deallocated object.
I notice that you're using NSString in your tests. You should be aware that NSString has internal memory optimizations that can yield non-standard behavior. I'd be wary of using NSString in these sorts of experiments.
Don't get me wrong: If you follow all of the Basic Memory Management Rules, NSString will behave properly. But if you're trying to examine what sort of errors/crashes result when deliberately not following those memory management rules, be aware that NSString can be misleading.
Needless to say, using ARC will greatly simplify your life. See the Transitioning to ARC Release Notes for more information.

Does it help to have a try-catch around NSManagedObjectContext save?

I make some non-essential changes to the database in a background thread (shortly after app launch), and then merge them into the main context. The background thread can end up making a lot of changes, but I don't want the context save to trip up over some validation errors or some inscrutable Core Data exception in this background processing; especially since I use iCloud with Core Data, users can end up with nilled-out relationships and what not. I just want the app to keep running instead of throwing an exception and quitting.
In this case, does it make sense to have a @try-@catch block around the context save? Are there any performance or memory management issues with doing this?
Something like this:
@try {
    [context performBlockAndWait:^{
        NSError *error = nil;
        if ([context save:&error]) {
            NSLog(@"Child context saved");
            [context.parentContext performBlockAndWait:^{
                NSError *parentError = nil;
                if ([context.parentContext save:&parentError]) {
                    NSLog(@"Parent context saved");
                }
            }];
        }
    }];
} ....
My app ships to thousands of customers, so it would be great to know beforehand whether this could cause more problems than it solves.

What exceptions are being thrown?
Since -[NSManagedObjectContext save:] uses the NSError out-parameter pattern, I would generally expect it NOT to throw. However, the general pattern in Cocoa is that exceptions are "death" and not considered recoverable.
There are places in various system frameworks that throw and catch exceptions (which you can see by setting an exception-throw breakpoint in the debugger) -- I'm looking at you Cocoa bindings -- but generally speaking if an exception bubbles up to your app's code, you're already "dead in the water."
Are there any performance or memory management issues with doing this?
These days, the performance penalty for a @try/@catch/@finally is pretty minimal (this wasn't always the case). There are memory management implications to be sure (which is probably why exceptions are generally "death" on this platform). If you're using ARC and exit a scope by way of an exception being thrown, retains taken by ARC are not released. As described here:
The standard Cocoa convention is that exceptions signal programmer
error and are not intended to be recovered from. Making code
exceptions-safe by default would impose severe runtime and code size
penalties on code that typically does not actually care about
exceptions safety. Therefore, ARC-generated code leaks by default on
exceptions, which is just fine if the process is going to be
immediately terminated anyway. Programs which do care about recovering
from exceptions should enable the option.
In short, there's probably no point in wrapping a context save operation in a @try/@catch block.
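Since save: reports failures through its return value and NSError out-parameter, a background save can handle validation failures without any exception handling. The following is only a minimal sketch under assumptions: SaveOrRollback is a hypothetical helper name, the context is assumed to use a queue concurrency type, and discarding the pending changes with rollback is assumed to be acceptable because the question describes them as non-essential.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper (not from the original answer): saves `context` on its
// own queue and reports failure through the return value, not an exception.
static BOOL SaveOrRollback(NSManagedObjectContext *context) {
    __block BOOL saved = YES;
    [context performBlockAndWait:^{
        if (![context hasChanges]) {
            return; // nothing to save
        }
        NSError *error = nil;
        saved = [context save:&error];
        if (!saved) {
            // Validation failures arrive here as an NSError. When several
            // validations fail, the individual errors are listed under
            // NSDetailedErrorsKey.
            NSLog(@"Save failed: %@ %@", error, error.userInfo[NSDetailedErrorsKey]);
            // Assumption: the changes are non-essential, so drop them.
            [context rollback];
        }
    }];
    return saved;
}
```

The caller can then decide whether to retry later or simply carry on; the app keeps running either way.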

Can I use NSManagedObjectContext with NSPrivateQueueConcurrencyType in a concurrent GCD queue

My approach so far has been something like this:
1- A main context initialized like so:
_persistentStoreCoordinator = [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:self.model];
if (![_persistentStoreCoordinator addPersistentStoreWithType:NSSQLiteStoreType configuration:nil URL:storeURL options:nil error:&error]) {
    DDLogModel(@"Unresolved error %@", error.localizedDescription);
    return;
}
self.context = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
self.context.persistentStoreCoordinator = _persistentStoreCoordinator;
2- Then, as I go about creating core data objects or modifying their relationships concurrently:
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
NSManagedObjectContext *tempContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSConfinementConcurrencyType];
tempContext.persistentStoreCoordinator = self.persistentStoreCoordinator;
// Do stuff
[tempContext save:nil];
});
3- And finally, the main context merge via the NSManagedObjectContextDidSaveNotification
However, I've recently seen a different approach where for step 2, they instead create a context with NSPrivateQueueConcurrencyType, make it a child of the main context, and do any work by means of -performBlock:
Is this last approach concurrent by default (even without explicitly dispatching it as such)? Or what's the advantage over the approach I explained?
Another thing that threw me off is that even though contexts have both parentContext and persistentStoreCoordinator properties, it appears that setting the latter means one cannot set the former. That is, does a context with a persistent store coordinator actually have that store coordinator as its parent context?
UPDATE:
Another interesting thing is that with the approach I described above (using GCD), every now and then when I do [tempContext save:] I get a weird behaviour: no error is returned (assuming I do pass in an NSError object, unlike in the example), but if I turn on the generic Objective-C exception breakpoint, the background thread does stop there, as if there were an exception. However, if I continue, the app does not crash and keeps going, and the main moc seems to be just fine.

You are right. performBlock will automatically perform the work in the background. In some cases it might make sense to use the current thread (main or background), in which case you can use performBlockAndWait. Using child contexts is the recommended approach.
I suppose your setup could work just as well. I guess the advantage of using a child context lies in a more structured approach of saving, i.e. "pushing up" saves into the parent context. Only the parent context will actually touch the persistent store, so this is better in terms of thread safety.
Your last question is not clear. A context can only have a context as its parent context, not a persistent store coordinator. However, what might be confusing you is that prior to iOS 5 there was only a "Parent Store", and with the introduction of child contexts it could be replaced optionally by a parent context. Read all about it here.
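As a rough illustration of the child-context approach, here is a minimal sketch. PerformBackgroundWork is a hypothetical helper name, and the sketch assumes the parent was created with a queue concurrency type (such as NSMainQueueConcurrencyType), which a parent context must have.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: runs `work` on a private-queue child of `mainContext`,
// then pushes the resulting changes up to the parent and saves the parent.
static void PerformBackgroundWork(NSManagedObjectContext *mainContext,
                                  void (^work)(NSManagedObjectContext *child)) {
    NSManagedObjectContext *child =
        [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    child.parentContext = mainContext;

    // performBlock: returns immediately; the block runs on the child's
    // private queue, so no explicit dispatch_async is needed.
    [child performBlock:^{
        work(child);

        NSError *error = nil;
        if ([child hasChanges] && ![child save:&error]) {
            NSLog(@"Child save failed: %@", error);
            return;
        }
        // Saving the child only pushes its changes into the parent. The
        // parent still has to save for them to reach the persistent store.
        [mainContext performBlock:^{
            NSError *parentError = nil;
            if ([mainContext hasChanges] && ![mainContext save:&parentError]) {
                NSLog(@"Parent save failed: %@", parentError);
            }
        }];
    }];
}
```

Objects created or fetched inside the work block belong to the child context and should only be touched from within that block.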

Core Data Single Managed Object Context and two threads

I am going crazy trying to figure this out. I am working on an application that is syncing up data from the webserver. There is a background thread that is pulling data from the server to the application. At the same time I am making changes to the UI. The values changed on UI are being saved to core data in foreground.
Throughout the application I have one managedObjectContext that I fetch from the app delegate every time I create a fetch controller. App delegate code:
- (NSManagedObjectContext *)managedObjectContext
{
if (__managedObjectContext != nil) {
return __managedObjectContext;
}
NSPersistentStoreCoordinator *coordinator = [self persistentStoreCoordinator];
if (coordinator != nil) {
__managedObjectContext = [[NSManagedObjectContext alloc] init];
[__managedObjectContext setPersistentStoreCoordinator:coordinator];
}
return __managedObjectContext;
}
Now the problem is that I am getting errors while trying to save the context. The errors happen at random points in the code. I save the context as soon as I make a change to any entity. Also, each entity has two relationships: a to-many relationship to its children and a to-one relationship to its parent. All relationships have appropriate inverses.
I think I am doing something conceptually wrong here by maintaining one context. Could you please advise how I should manage contexts in a situation where both background and foreground threads are reading and writing to Core Data? Thanks.

Managed object contexts are not thread safe, so if you use the same one on more than one thread without considering concurrency-- you're going to have major problems. As in, crashing and/or data loss and maybe even data corruption. There are a couple of ways to deal with this:
Use one of the queue concurrency types when creating the context-- see docs for initWithConcurrencyType:. Then, whenever you access the data store, use either performBlock: or performBlockAndWait: to synchronize access.
Create a new managed object context for the background thread. Use NSManagedObjectContextDidSaveNotification and mergeChangesFromContextDidSaveNotification: to keep multiple contexts synchronized.
But whatever you do, do not just use one managed object context on more than one thread.
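The second option (a separate background context kept in sync through notifications) could be sketched roughly as follows. MakeBackgroundContext is a hypothetical helper name, and the sketch assumes the main context was created with NSMainQueueConcurrencyType so that performBlock: may be called on it; a context created with plain init, as in the question, would need to be changed first.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: creates a background context that shares the main
// context's coordinator and merges each of its saves back into `mainContext`.
static NSManagedObjectContext *MakeBackgroundContext(NSManagedObjectContext *mainContext) {
    NSManagedObjectContext *background =
        [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    background.persistentStoreCoordinator = mainContext.persistentStoreCoordinator;

    // Simplification: the observer token is discarded, so this observer is
    // never removed. Real code should keep the token and remove the observer.
    [[NSNotificationCenter defaultCenter]
        addObserverForName:NSManagedObjectContextDidSaveNotification
                    object:background
                     queue:nil
                usingBlock:^(NSNotification *note) {
        // The notification is delivered on the thread that saved, so hop to
        // the main context's own queue before merging.
        [mainContext performBlock:^{
            [mainContext mergeChangesFromContextDidSaveNotification:note];
        }];
    }];
    return background;
}
```

The background thread would then do all of its reads and writes inside [background performBlock:], and the UI would keep using the main context only.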

How to minimize the costs for allocating and initializing an NSDateFormatter?

I noticed that using an NSDateFormatter can be quite costly. I figured out that allocating and initializing the object already consumes a lot of time.
Further, it seems that using an NSDateFormatter in multiple threads increases the costs. Can there be a blocking where the threads have to wait for each other?
I created a small test application to illustrate the problem. Please check it out.
http://github.com/johnjohndoe/TestNSDateFormatter
git://github.com/johnjohndoe/TestNSDateFormatter.git
What is the reason for such costs and how can I improve the usage?
17.12. - To update my observation: I do not understand why the threads run longer when processed in parallel than when they run in serial order. The time difference only occurs when NSDateFormatter is used.

Note: Your example program is very much a micro-benchmark and very effectively maximally amplifies that cost of a date formatter. You are comparing doing absolutely nothing with doing something. Thus, whatever that something is, it will appear to be something times slower than nothing.
Such tests are extremely valuable and extremely misleading. Micro-benchmarks are generally only useful when you have a real world case of Teh Slow. If you were to make this benchmark 10x faster (which, in fact, you probably could with what I suggest below) but the real world case is only 1% of overall CPU time used in your app, the end result is not going to be a dramatic speed improvement -- it will be barely noticeable.
What is the reason for such costs?
NSDateFormatter* dateFormatter = [[NSDateFormatter alloc] init];
[dateFormatter setDateFormat:@"yyyyMMdd HH:mm:ss.SSS"];
Most likely, the cost is associated with both having to parse/validate the date format string and having to do any kind of locale specific goop that NSDateFormatter does. Cocoa has extremely thorough support for localization, but that support comes at a cost of complexity.
Seeing as how you wrote a rather awesome example program, you could fire up your app in Instruments and try the various CPU sampling instruments to both understand what is consuming CPU cycles and how Instruments works (if you find anything interesting, please update your question!).
Can there be a blocking where the threads have to wait for each other?
I'm surprised it doesn't simply crash when you use a single formatter from multiple threads. NSDateFormatter doesn't specifically mention that it is thread safe. Thus, you must assume that it is not thread safe.
How can I improve the usage?
Don't create so many date formatters!
Either keep one around for a batch of operations and then get rid of it or, if you use 'em all the time, create one at the beginning of your app's run and keep around until the format changes.
For threading, keep one per thread around, if you really really have to (I'd bet that is excessive -- that the architecture of your app is such that creating one per batch of operations will be more sensible).
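To illustrate the "one formatter per batch" idea, here is a minimal sketch. FormatDates is a hypothetical helper name; the format string is the one from the question, while the fixed en_US_POSIX locale and UTC time zone are assumptions added only to make the output predictable.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: formats a whole batch of dates with a single
// formatter, so the allocation and setup cost is paid once per batch.
static NSArray<NSString *> *FormatDates(NSArray<NSDate *> *dates) {
    NSDateFormatter *formatter = [[NSDateFormatter alloc] init];
    // Assumptions for deterministic output: fixed locale and UTC.
    formatter.locale = [NSLocale localeWithLocaleIdentifier:@"en_US_POSIX"];
    formatter.timeZone = [NSTimeZone timeZoneForSecondsFromGMT:0];
    formatter.dateFormat = @"yyyyMMdd HH:mm:ss.SSS";

    NSMutableArray<NSString *> *result = [NSMutableArray arrayWithCapacity:dates.count];
    for (NSDate *date in dates) {
        [result addObject:[formatter stringFromDate:date]];
    }
    return result;
}
```

Because the formatter never leaves the function, each thread that processes a batch gets its own instance and no locking is needed.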

I like to use a GCD serial queue to ensure thread safety; it's convenient, effective, and efficient. Something like:
dispatch_queue_t formatterQueue = dispatch_queue_create("formatter queue", NULL);
NSDateFormatter *dateFormatter;
// ...
- (NSDate *)dateFromString:(NSString *)string
{
__block NSDate *date = nil;
dispatch_sync(formatterQueue, ^{
date = [dateFormatter dateFromString:string];
});
return date;
}

Using -initWithDateFormat:allowNaturalLanguage: instead of -init followed by -setDateFormat: should be much faster (probably ~2x).
In general though, what bbum said: cache your date formatters for hot code.
(Edit: this is no longer true in iOS 6/OSX 10.8, they should all be equally fast now)

Use GCD's dispatch_once and you're good. It synchronizes creation across threads and ensures that the date formatter is created only once.
+ (NSDateFormatter *)ISO8601DateFormatter {
    static NSDateFormatter *formatter;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        formatter = [[NSDateFormatter alloc] init];
        formatter.dateFormat = @"yyyy-MM-dd'T'HH:mm:ssZ";
    });
    return formatter;
}

Since creating and initializing an NSDateFormatter, as well as changing its format and locale, costs a lot, I've created a "factory" class to handle the reuse of my NSDateFormatters.
I have an NSCache instance where I store up to 15 NSDateFormatter instances, keyed by format and locale, at the moment I create them. Later, when I need one again, I ask my class for an NSDateFormatter with format "dd/MM/yyyy" and locale "pt-BR", and it returns the corresponding, already loaded NSDateFormatter instance.
You should agree that it's an edge case to have more than 15 date formats per runtime in most standard applications, so I assume this is a great limit to cache them. If you use only 1 or 2 different date formats, you'll have only this number of loaded NSDateFormatter instances. Sounds good for my needs.
If you would like to try it, I made it public on GitHub.
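The idea could look roughly like the sketch below. This is not the author's GitHub implementation: CachedFormatter is a hypothetical function name, and the "format|locale" cache key is an assumption made for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical factory: returns a cached formatter for the given format and
// locale identifier, creating and caching one on the first request.
static NSDateFormatter *CachedFormatter(NSString *format, NSString *localeIdentifier) {
    static NSCache<NSString *, NSDateFormatter *> *cache;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        cache = [[NSCache alloc] init];
        cache.countLimit = 15; // NSCache treats this as a guideline, not a strict limit
    });

    // Assumed key scheme: format and locale joined with a separator.
    NSString *key = [NSString stringWithFormat:@"%@|%@", format, localeIdentifier];
    NSDateFormatter *formatter = [cache objectForKey:key];
    if (formatter == nil) {
        formatter = [[NSDateFormatter alloc] init];
        formatter.locale = [NSLocale localeWithLocaleIdentifier:localeIdentifier];
        formatter.dateFormat = format;
        [cache setObject:formatter forKey:key];
    }
    return formatter;
}
```

NSCache itself is safe to use from several threads, but callers should not change the format or locale of a formatter they get back, since the same instance is handed to everyone asking for that key.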

I think the best implementation is like below:
NSMutableDictionary *threadDictionary = [[NSThread currentThread] threadDictionary];
NSDateFormatter *dateFormatter = threadDictionary[@"mydateformatter"];
if (!dateFormatter) {
    @synchronized(self) {
        if (!dateFormatter) {
            dateFormatter = [[NSDateFormatter alloc] init];
            [dateFormatter setDateFormat:@"yyyy-MM-dd HH:mm:ss"];
            [dateFormatter setTimeZone:[NSTimeZone timeZoneWithName:@"Asia/Shanghai"]];
            threadDictionary[@"mydateformatter"] = dateFormatter;
        }
    }
}
